Hacker News
Posted by lostmsu 9 hours ago
rayboy1995 7 hours ago
Thanks!! I had disabled that previously while debugging. I can confirm it is helping accuracy, from what I can tell so far. (And speed, since the cache is preserved more often!)
satvikpendem 5 hours ago
Use the MTP (multi-token prediction) models, which can roughly double token generation speed, for example:
https://unsloth.ai/docs/models/qwen3.6#mtp-guide
rayboy1995 14 minutes ago
Very interesting, I'll have to check this out. Thank you. This is why I love HN.