upvote
Given the cost increase associated with this model, and previous model releases, I think the size is trending upwards, not down.
reply
The speed says otherwise. I think they're increasing costs since they want to start seeing ROI.
reply
Those are (mostly) new, faster TPU
reply
latest TPU's appear to reach 800tok/s rather than the advertised 300tok/s.
reply
They demoed today 8i running ate 1300 to 1600ish tokens per second. I imagine that is caused by having a single rack serving the model just for the demo.
reply
There's a limit to how much you can "scale" this process, it's linear, but if we did napkin math based on vllm parallel batched streams only lose around ~50% performance compared to single-stream output so doesn't explain the ridicioulusly fast numbers here.

I wish google just came out and told us how large their flash model is, because if it's as big or smaller than gpt-5.4-nano that's the real headline here.

reply
> Engineers at google have publically stated that the models are too big and are far from their potencial

Can you link to a source?

reply
I wish I could, it was one of those youtube podcast type interviews with one of the engineers, there was a lot more shared, but that line stuck with me the most.
reply
Source please cause i dont believe that for once second
reply
Don’t let that fool yourself. Google will have SOTA models as big as or even bigger than their competitors.

They are just refining their current models while they finish training the next generation.

They will all come out at about the same time. Anthropic, OpenAi, Google, xAI

reply
Anthropic has been sitting on Mythos for a while now. I guess they don't feel pressured to fuck it ship it until anyone else gets a 10T to work.
reply
According to people who have access to Mythos, it is slightly worse than GPT-5.5-xhigh. At least for security tasks.

Hold on, I think this claim needs some hard data. Here you go gentlemen:

https://www.aisi.gov.uk/blog/our-evaluation-of-openais-gpt-5...

reply
See the later post testing a newer Mythos checkpoint, though: https://www.aisi.gov.uk/blog/how-fast-is-autonomous-ai-cyber...
reply
Fair enough
reply
That claim keeps contradicted hard by other parties, who say Mythos beats 5.5 resoundingly on both autonomous search and discovery and creation of complex exploit chains.

There might be a harness difference, but also, this CTF-type benchmark might not capture the capability difference fully.

reply
[dead]
reply
Anthropic can sell Mythos to Fortune 500 companies and bypass the average user. I'm not sure how much is hype but I see things like this https://blog.cloudflare.com/cyber-frontier-models/
reply
It's doubtful they have the compute to make mythos publicly available even after the SpaceX datacenter deal. And why sell it publicly if people are still willing to pay for Opus 4.7?
reply
I suspect that Mythos doesn't have a business model that works
reply
Google’s pro models are almost certainly bigger than Openai’s lol
reply
Why would that be? I am curious why do you think that.
reply
E.g. because they are behind on research and so must compensate with size to achieve similar level of intelligence. At least this is what I heard.

For intelligence/size only OpenAI and Anthropic are the frontier. Google has more compute so it can compensate for that with size of the models...

reply
I'd argue Qwen is pushing the Pareto frontier considerably further when you take size into account.
reply
Because TPUs are more efficient, and its cheaper for them to field them in higher quantity since they own the chip.
reply
I mean, yes and no.

Nobody really knows the answer to which one is more optimal

* Large model trained on a large amount of data across multiple domains, that doesn't need any extra content to answer questions.

* Smaller model that is smart enough to go fetch extra relevant content, and then operate on essentially "reformatting" the context into an answer.

reply