It’s a 6bn model. Totally different class. I’m more excited about “frontier small language models” tbh.
It's a 119B model, 6B active.

That's still roughly 3-13x smaller by total parameter count than the other models in that graph though (400B, 1T, 1.5T).

Agreed, though open weights + relatively small is still headline-worthy. This thing really cooks.