
They demoed 8i today running at roughly 1300 to 1600 tokens per second. I imagine that is because a single rack was serving the model just for the demo.

by himata4113 6 hours ago

There's a limit to how much you can "scale" this process; it's linear. By napkin math based on vLLM, parallel batched streams only lose around 50% performance compared to single-stream output, so that doesn't explain the ridiculously fast numbers here.
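A rough sketch of that napkin math, taking the ~50% batching loss above at face value and assuming the 1300 to 1600 tokens/s demo figures were single-stream (the function name and the loss parameter are only illustrative):

```python
def batched_rate(single_stream_tps: float, batched_loss: float = 0.5) -> float:
    """Per-stream tokens/s under batching, given the single-stream rate.

    batched_loss is the fraction of single-stream speed lost when streams
    are batched (about 0.5 per the estimate above; an assumption, not a
    measured value).
    """
    if not 0 <= batched_loss < 1:
        raise ValueError("batched_loss must be in [0, 1)")
    return single_stream_tps * (1 - batched_loss)


# Demo figures quoted in the thread, treated as single-stream rates.
for demo_tps in (1300, 1600):
    print(demo_tps, "->", batched_rate(demo_tps))
```

Under those assumptions, the same setup would still do about 650 to 800 tokens/s per stream when batched, so a dedicated rack accounts for at most a factor of two.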

I wish Google just came out and told us how large their flash model is, because if it's as big as or smaller than gpt-5.4-nano, that's the real headline here.

by Jabbles 10 hours ago

> Engineers at Google have publicly stated that the models are too big and are far from their potential

Can you link to a source?

by himata4113 5 hours ago

I wish I could. It was one of those YouTube podcast-type interviews with one of the engineers. There was a lot more shared, but that line stuck with me the most.

by Dinux 10 hours ago

Source please, because I don't believe that for one second.

by maipen 11 hours ago

Don’t let that fool you. Google will have SOTA models as big as or even bigger than their competitors'.

They are just refining their current models while they finish training the next generation.

They will all come out at about the same time: Anthropic, OpenAI, Google, xAI.

by ACCount37 11 hours ago

Anthropic has been sitting on Mythos for a while now. I guess they don't feel pressured to "fuck it, ship it" until someone else gets a 10T to work.

by throwa356262 10 hours ago

According to people who have access to Mythos, it is slightly worse than GPT-5.5-xhigh. At least for security tasks.

Hold on, I think this claim needs some hard data. Here you go gentlemen:

https://www.aisi.gov.uk/blog/our-evaluation-of-openais-gpt-5...

by aesthesia 10 hours ago

See the later post testing a newer Mythos checkpoint, though: https://www.aisi.gov.uk/blog/how-fast-is-autonomous-ai-cyber...

by throwa356262 8 hours ago

Fair enough

by ACCount37 10 hours ago

That claim keeps getting contradicted hard by other parties, who say Mythos beats 5.5 resoundingly on both autonomous search and discovery and the creation of complex exploit chains.

There might be a harness difference, but also, this CTF-type benchmark might not capture the capability difference fully.


by abirch 10 hours ago

Anthropic can sell Mythos to Fortune 500 companies and bypass the average user. I'm not sure how much is hype, but I see things like this: https://blog.cloudflare.com/cyber-frontier-models/

by Sevii 11 hours ago

It's doubtful they have the compute to make Mythos publicly available, even after the SpaceX datacenter deal. And why sell it publicly if people are still willing to pay for Opus 4.7?

by outside1234 10 hours ago

I suspect that Mythos doesn't have a business model that works

by howdareme 11 hours ago

Google’s pro models are almost certainly bigger than OpenAI’s lol

by fikama 10 hours ago

Why would that be? I am curious why you think that.

by mnicky 9 hours ago

E.g. because they are behind on research and so must compensate with size to achieve a similar level of intelligence. At least this is what I heard.

For intelligence/size, only OpenAI and Anthropic are at the frontier. Google has more compute, so it can compensate for that with the size of its models...

by snovv_crash 8 hours ago

I'd argue Qwen is pushing the Pareto frontier considerably further when you take size into account.

by ActorNightly 9 hours ago

Because TPUs are more efficient, and it's cheaper for them to field them in higher quantity since they own the chip.

by ActorNightly 9 hours ago

I mean, yes and no.

Nobody really knows which one is more optimal:

* Large model trained on a large amount of data across multiple domains, that doesn't need any extra content to answer questions.

* Smaller model that is smart enough to go fetch extra relevant content, and then essentially "reformat" that context into an answer.