Didn't xAI just release the Grok 4.20 beta yesterday or the day before?
Musk said Grok 5 is currently being trained and has 7 trillion params (Grok 4 had 3 trillion).
My understanding is that all recent gains have come from post-training, and no one (publicly) knows how much scaling pre-training will still help at this point.

Happy to learn more about this if anyone has more information.

You gain more benefit spending compute on post-training than on pre-training.

But scaling pre-training is still worth it if you can afford it.
