Hacker News
by sosodev 19 hours ago
by camdenreslink 17 hours ago:
It could be. Or just smarter caching (which wouldn't necessarily have to do with model intelligence). Or just overfitting on the 95% most common prompts (which could save tokens but make the models less intelligent/flexible).
by energy123 18 hours ago:
Less cost to accomplish the same goal is a sign of intelligence. That's not necessarily achieved with fewer tokens, but it may be.
by mchusma 18 hours ago:
Kind of? But I really care about price, speed, and quality. If it used 10x the tokens at 1/10th the cost per token with the same latency, I would be neutral on it.
Kimi 2.6, for example, seems to throw more tokens at problems to improve performance (for better or worse).
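The neutrality claim above is just arithmetic: total cost is tokens times price per token, so a 10x token increase paired with a 1/10th per-token price leaves the bill unchanged. A minimal sketch with hypothetical numbers (the rates below are illustrative, not any real model's pricing):

```python
# Hypothetical baseline: 1,000 tokens at an assumed $0.00001 per token.
baseline_tokens = 1_000
baseline_price_per_token = 0.00001  # illustrative $/token, not real pricing

# "Verbose" model from the comment: 10x the tokens at 1/10th the per-token price.
verbose_tokens = baseline_tokens * 10
verbose_price_per_token = baseline_price_per_token / 10

baseline_cost = baseline_tokens * baseline_price_per_token
verbose_cost = verbose_tokens * verbose_price_per_token

# The two total costs are equal, so on price alone the user is neutral;
# speed and quality would have to break the tie.
print(baseline_cost, verbose_cost)
```

Latency is the part this leaves out: generating 10x the tokens usually takes longer unless throughput rises to match, which is why the comment conditions neutrality on "same latency".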