Hacker News
by cma 5 hours ago
by gunalx 3 hours ago
They probably use it on all models. Fast is probably just a resource pool with less congestion, which gives faster throughput per user but is less efficient.
by cma 3 minutes ago
If it speeds up prefill too, I guess so.
6 minutes ago
deleted