Hacker News
by zozbot234, 7 hours ago
by Aurornis, 4 hours ago
It depends on the model and the mix. For some recent MoE models it's been reasonably fast to offload part of the processing to the CPU. The GPU's speed still contributes a lot, as long as its share of the compute isn't too small.
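
To put rough numbers on that, here's a toy back-of-envelope sketch. It assumes decoding is purely memory-bandwidth bound and that the GPU and CPU parts run one after the other, and it ignores transfer overhead and compute limits. The function name, the 10 GB of active weights per token, and the bandwidth figures are all made up for illustration, not measurements from any real setup.

```python
def tokens_per_second(active_gb, gpu_fraction, gpu_bw_gbs, cpu_bw_gbs):
    """Toy estimate of decode speed with a GPU/CPU split.

    active_gb:    weights read per token, in GB (hypothetical)
    gpu_fraction: share of those weights held on the GPU, 0.0 to 1.0
    gpu_bw_gbs:   GPU memory bandwidth in GB/s (hypothetical)
    cpu_bw_gbs:   CPU memory bandwidth in GB/s (hypothetical)
    """
    # Time to stream each part's weights once per token, run sequentially.
    gpu_time = active_gb * gpu_fraction / gpu_bw_gbs
    cpu_time = active_gb * (1.0 - gpu_fraction) / cpu_bw_gbs
    return 1.0 / (gpu_time + cpu_time)


# Sweep the GPU share with made-up numbers: 10 GB active, 1000 vs 100 GB/s.
for fraction in (0.0, 0.5, 0.9, 1.0):
    rate = tokens_per_second(10.0, fraction, 1000.0, 100.0)
    print(f"GPU share {fraction:.0%}: {rate:.1f} tok/s")
```

Under these assumptions the slow CPU part dominates until the GPU holds most of the work, which is one way to read the "not too small" caveat. Real setups can overlap the two parts, so treat this as a lower bound on what the split costs, not a prediction.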