If these Gemini 3.5 numbers are accurate, then I'd wager GPT 5.5 and Opus 4.7 are a lot smaller than people have speculated, too. It's not that frontier labs can't create a 5T+ parameter model, but they don't have the data to optimize a model of that size.
Gemini 3.5 Flash is really smart at one-shot coding reasoning, btw. Near the frontier. But it doesn't do so well in long-horizon agentic tasks with arbitrary tool availability. This is a common theme with Google models, and the opposite of what we see with Chinese models (start dumb, iterate consistently toward a smart solution).
Data at https://gertlabs.com/rankings
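For the data-limit point above, here's a rough back-of-envelope. It assumes the Chinchilla rule of thumb of roughly 20 training tokens per parameter and the usual 6 * N * D estimate for dense-transformer training FLOPs. Both are my assumptions for illustration, not anything a lab has confirmed for these models, and MoE or heavily over-trained recipes would shift the numbers.

```python
# Assumption: compute-optimal training needs ~20 tokens per parameter
# (Chinchilla rule of thumb). Real frontier recipes may differ a lot.
TOKENS_PER_PARAM = 20


def optimal_tokens(params: float) -> float:
    """Approximate compute-optimal training tokens for a dense model."""
    return params * TOKENS_PER_PARAM


def training_flops(params: float, tokens: float) -> float:
    """Rough ~6 * N * D estimate of training FLOPs for a dense transformer."""
    return 6 * params * tokens


# Hypothetical sizes: the 600B-800B range mentioned in the thread, plus 5T.
for params in (600e9, 800e9, 5e12):
    tokens = optimal_tokens(params)
    flops = training_flops(params, tokens)
    print(f"{params / 1e12:.1f}T params -> {tokens / 1e12:.0f}T tokens, {flops:.1e} FLOPs")
```

Under that rule of thumb a dense 5T model wants on the order of 100T training tokens, which is the scale of data I'm doubting anyone has in usable quality.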
I mitigate it by creating dense planning docs for everything and executing iteratively.
Lots of time wasted on procedure, unfortunately.
Mythos is an exception that's larger.
I think it’s pure economics. Flash models are OP for the price, which leads to more demand than Google can serve. It's likely priced high to reduce load, and hey, if it still makes money, just keep the margin.
It’s not a rumor: there are many public announcements about multi-billion-dollar compute deals for other AI companies.
With the Pro variant being around 600B - 800B
My testing just compares its performance and output to other models in the same size range, so it's not as scientific as yours.