Hacker News
by dakolli 12 hours ago
bmacho 12 hours ago
"447 / 6144 tokens" "Generated in 0.026s • 15,718 tok/s"
This is crazy fast. I had always expected this kind of speed to be ~2 years away, but it's here now.
Lalabadie 12 hours ago
The full answer pops in within milliseconds. It's impressive, and it feels like a completely different technology just by forgoing the need to stream the output.
machiaweliczny 10 hours ago
We need that for this Chinese 3B model that thinks for 45s on hello world but also solves math.
FergusArgyll 12 hours ago
Because most models today generate fairly slowly, they give the impression of someone typing on the other end. This is just <enter> -> wall of text. Wild.