Many labs aren't able to keep up with the frontier, xAI, Mistral
It’s like someone negotiating by saying, “I’ll waste even MORE money to build something worse if you don’t give me a deal.”
I’m not discounting there may be other advantages to doing it. I just don’t think negotiating is one.
Do we have data to substantiate that claim?
Both Spud and Mythos can also scale via inference time compute.
Meta simply did not have enough compute online, long enough ago, to have a similar PT.
Do we have data to substantiate that claim?