Hacker News
by dannyw 15 hours ago
andai 5 hours ago
It's a 119B model, 6B active.
That's still roughly 3-13x smaller in total parameters than the other models in that graph, though (400B, 1T, 1.5T).
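For anyone who wants to check the arithmetic, here's a quick sketch. The function and variable names are my own, and it compares total parameter counts only, which is an assumption about what the graph plots.

```python
# Parameter counts in billions, taken from the comment above.
TOTAL_B = 119                 # total parameters
ACTIVE_B = 6                  # active parameters per token
OTHERS_B = [400, 1000, 1500]  # 400B, 1T, 1.5T


def size_ratios(total_b, others_b):
    """Return how many times larger each other model is than total_b."""
    return [other / total_b for other in others_b]


ratios = size_ratios(TOTAL_B, OTHERS_B)
for other, ratio in zip(OTHERS_B, ratios):
    print(f"{other}B / {TOTAL_B}B = {ratio:.1f}x")

# Share of the 119B parameters that are active.
active_fraction = ACTIVE_B / TOTAL_B
print(f"active fraction: {active_fraction:.1%}")
```

Comparing on active parameters instead would need the active counts of the other models, which aren't given here.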