Hacker News
by zkmon 16 hours ago
zozbot234 16 hours ago
MoE is not a bad idea for local inference if you have fast storage to offload to, and this is quickly becoming feasible with PCIe 5.0 interconnects.
perbu 14 hours ago
MoE is excellent for unified-memory inference hardware like DGX Spark, Mac Studio, etc. The large memory means you can fit quite a few B's of parameters, and the smaller experts keep those tokens flowing fast.
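For a rough feel of why both comments hold, here is a back-of-envelope sketch in Python. It assumes decode speed is limited purely by how fast weights can be read, so tokens per second is at most bandwidth divided by the bytes of active weights touched per token. The model size (100B total, 10B active), the 4-bit quantization, and both bandwidth figures are made-up illustrative numbers, not measurements of any real model or device, and the function names are hypothetical.

```python
def weights_footprint_gb(total_params_billions, bytes_per_param):
    """Memory needed to hold all weights, in GB (1e9 bytes)."""
    return total_params_billions * bytes_per_param


def decode_tokens_per_second(active_params_billions, bytes_per_param, bandwidth_gb_per_s):
    """Upper bound on decode speed if reading the active weights is the only cost."""
    bytes_per_token = active_params_billions * 1e9 * bytes_per_param
    return bandwidth_gb_per_s * 1e9 / bytes_per_token


# Hypothetical MoE model: 100B total parameters, 10B active per token,
# 4-bit weights (0.5 bytes per parameter). All values are assumptions.
TOTAL_B, ACTIVE_B, BYTES_PER_PARAM = 100, 10, 0.5

# Assumed bandwidths, for illustration only.
UNIFIED_MEMORY_GB_S = 250  # a unified-memory machine
FAST_STORAGE_GB_S = 14     # a fast NVMe drive on a PCIe 5.0 x4 link

footprint = weights_footprint_gb(TOTAL_B, BYTES_PER_PARAM)
moe_in_memory = decode_tokens_per_second(ACTIVE_B, BYTES_PER_PARAM, UNIFIED_MEMORY_GB_S)
dense_in_memory = decode_tokens_per_second(TOTAL_B, BYTES_PER_PARAM, UNIFIED_MEMORY_GB_S)
# Worst case for offloading: every active weight is streamed from storage on every token.
moe_from_storage = decode_tokens_per_second(ACTIVE_B, BYTES_PER_PARAM, FAST_STORAGE_GB_S)

print(f"weights footprint:        {footprint:.0f} GB")
print(f"MoE, weights in memory:   {moe_in_memory:.1f} tok/s")
print(f"dense model, same size:   {dense_in_memory:.1f} tok/s")
print(f"MoE, streamed from disk:  {moe_from_storage:.1f} tok/s")
```

Under these assumptions the total parameter count sets how much memory you need, while only the active parameters set the speed, which is why a big unified-memory box suits MoE. The storage case is a pessimistic bound: keeping shared layers and frequently used experts in RAM would reduce how much has to be streamed per token.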