Hacker News
zozbot234 5 hours ago
yorwba 4 hours ago
It's feasible to put the expert routing logic in a previous layer. People have done it:
https://arxiv.org/abs/2507.20984
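To make the idea concrete, here is a minimal toy sketch of routing one layer ahead. It is not taken from the linked paper; the names `route`, `apply_experts`, and `forward_lookahead`, the scalar "experts", and the layer layout are all hypothetical and chosen only for illustration. The point it shows is that the experts for layer i+1 are picked from the hidden state entering layer i, so the choice is known before layer i's experts have run.

```python
import random

def route(hidden, router, k):
    """Pick the k experts with the highest router logits for this hidden state."""
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in router]
    return sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]

def apply_experts(hidden, expert_scales, chosen):
    """Toy expert layer: each 'expert' is a scalar gain; average them, add a residual."""
    mix = sum(expert_scales[e] for e in chosen) / len(chosen)
    return [h + mix * h for h in hidden]

def forward_lookahead(x, layers, k=2):
    """Run the stack, choosing layer i+1's experts from the input to layer i."""
    hidden = x
    # Layer 0 has no earlier layer, so it routes on its own input (assumption).
    chosen = route(hidden, layers[0]["router"], k)
    selections = []
    for i, layer in enumerate(layers):
        selections.append(chosen)
        next_chosen = None
        if i + 1 < len(layers):
            # Decided before layer i's experts run, so layer i+1's expert
            # weights could in principle be fetched during layer i's compute.
            next_chosen = route(hidden, layers[i + 1]["router"], k)
        hidden = apply_experts(hidden, layer["experts"], chosen)
        chosen = next_chosen
    return hidden, selections

# Hypothetical toy model: 3 layers, 4 experts each, hidden size 5.
rng = random.Random(0)
dim, num_experts, num_layers = 5, 4, 3
layers = [
    {
        "router": [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_experts)],
        "experts": [rng.uniform(-0.5, 0.5) for _ in range(num_experts)],
    }
    for _ in range(num_layers)
]
x = [rng.uniform(-1, 1) for _ in range(dim)]
output, selections = forward_lookahead(x, layers)
```

In this sketch `selections[1]` depends only on `x`, not on the output of layer 0, which is what lets the selection be known a layer early.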
snovv_crash 5 hours ago
Not manually, no. It would have to be learned, and the unpredictability of the expert selection would need to be a training loss term to minimize.
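One way such a term could look, as a rough sketch: an auxiliary loss that penalizes the gap between an early prediction of the routing distribution and the distribution the router actually produces. The function names, the use of cross-entropy, and the weight `lam` are all assumptions made for illustration, not something described in this thread.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predictability_loss(predicted_logits, actual_logits):
    """Cross-entropy of the early prediction against the actual router distribution."""
    p_actual = softmax(actual_logits)
    p_pred = softmax(predicted_logits)
    return -sum(a * math.log(q) for a, q in zip(p_actual, p_pred))

def total_loss(task_loss, predicted_logits, actual_logits, lam=0.1):
    """Task loss plus a weighted penalty for hard-to-predict routing (lam is hypothetical)."""
    return task_loss + lam * predictability_loss(predicted_logits, actual_logits)

# Hypothetical router logits over three experts.
actual = [2.0, 0.5, -1.0]
matched = predictability_loss(actual, actual)              # prediction agrees with router
mismatched = predictability_loss([-1.0, 0.5, 2.0], actual)  # prediction ranks experts backwards
```

The penalty is smallest when the early prediction matches the router, so minimizing it pushes the model toward routing decisions that can be guessed in advance. The weight `lam` sets how much task loss the model may give up for that.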
zozbot234 5 hours ago
Making the expert selection more predictable also means making it less effective. There's no real free lunch.