Hacker News
by htrp 3 hours ago
by electronsoup 1 hour ago
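Yeah, an MoE is a little worse than a dense model of the same total size, but you can often run a bigger MoE at respectable speeds even with part of it offloaded to CPU RAM, since only a fraction of the weights is active per token. Dense models really need to fit 100% in VRAM.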
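To put rough numbers on that, here's a back-of-envelope sketch. It assumes decoding is limited purely by memory bandwidth, so each generated token costs one read of the active weights. The model sizes, the 4-bit quantization, and the 50 GB/s RAM bandwidth are made-up illustrative figures, not measurements, and `decode_tokens_per_sec` is just a hypothetical helper name.

```python
def decode_tokens_per_sec(active_params_b: float,
                          bytes_per_param: float,
                          bandwidth_gb_s: float) -> float:
    """Rough upper bound on decode speed when memory bandwidth is the bottleneck.

    active_params_b: parameters touched per token, in billions
    bytes_per_param: storage per parameter (0.5 for 4-bit quantization)
    bandwidth_gb_s:  bandwidth of the memory holding the weights, in GB/s
    """
    gb_read_per_token = active_params_b * bytes_per_param
    return bandwidth_gb_s / gb_read_per_token


# All figures below are assumptions chosen for illustration.
RAM_BANDWIDTH_GB_S = 50.0   # assumed system RAM bandwidth
BYTES_PER_PARAM = 0.5       # assumed 4-bit weights

# Hypothetical dense model: all 70B parameters are read for every token.
dense = decode_tokens_per_sec(70, BYTES_PER_PARAM, RAM_BANDWIDTH_GB_S)

# Hypothetical MoE: larger in total, but only 10B parameters active per token.
moe = decode_tokens_per_sec(10, BYTES_PER_PARAM, RAM_BANDWIDTH_GB_S)

print(f"dense from RAM: {dense:.1f} tok/s")
print(f"MoE from RAM:   {moe:.1f} tok/s")
```

This ignores compute, KV cache reads, and any split between VRAM and RAM, so treat it as a ceiling rather than a prediction. The point is only that the estimate scales with active parameters, not total parameters.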