Hacker News
by lkm0 | 5 days ago
m_w_ | 5 days ago
Mythos is rumored to be ~10T parameters, so in this case I think the answer is yes, although I'm sure MoE, looped models, etc. play a role in the improvements as well.