To me, it is a very good data point.

Curl uses all sorts of tools, including AI tools, to find bugs. According to the article, these tools found hundreds of bugs, including a dozen CVEs.

Mythos found one vulnerability. That means Mythos is just another tool, not the revolution it claims to be.

It is common for a newly introduced tool to find a bunch of bugs, with diminishing returns afterward. Mythos finding one vulnerability is consistent with what I would expect from a major update to an existing tool, which is what Mythos is relative to existing LLM-based solutions.

by thombles 1 hour ago

The question is how many security vulnerabilities are actually left in the code after all the recent AI attention. Either Mythos is a nothingburger, or it's substantially more powerful but there's nothing left to do. Even a large amount of C can be correct eventually. Curl has the _potential_ to become a good data point maybe 6-12 months from now - if researchers and new tools find many more vulnerabilities then Mythos is proved to be hype. If they don't, then maybe Mythos is overkill for today's curl and its capabilities are better deployed elsewhere (like Firefox, apparently).

by GuB-42 29 minutes ago

I have a hard time believing that Mythos found the only remaining Curl vulnerability. It is possible, but highly improbable.

And it is not overkill: the proof is that it found that vulnerability. It is like saying the new version of some static analyzer with some new rules is "overkill" because it found only one more bug than the previous version. Whether it is overkill depends on context. Using a very expensive model like Mythos for some little-used, non-critical software is overkill, but for Curl, it absolutely isn't.

If Mythos found loads of vulnerabilities in Firefox but not in Curl, I wouldn't say that's because Mythos is so good, but rather that, with the release of Mythos, they did some testing that could have been done before using the same tools Curl has used.

by thombles 18 minutes ago

We will see. As for "testing that could have been done before", Mozilla's posts indicate otherwise. Use of Opus 4.6 led to 22 security-sensitive bugs vs Mythos' 271 (https://blog.mozilla.org/en/privacy-security/ai-security-zer...). They already had the methodology in place when the more powerful model came along (https://hacks.mozilla.org/2026/05/behind-the-scenes-hardenin...):

> Once the end-to-end pipeline is in place, it’s trivial to swap in different models when they become available. Building this pipeline early helped us find a number of serious bugs using publicly-available models, and it also helped us hit the ground running when we had the opportunity to evaluate Claude Mythos Preview. In our experience, model upgrades increase the effectiveness of the entire pipeline: the system gets simultaneously better at finding potential bugs, creating proof-of-concept test cases to demonstrate them, and articulating their pathology and impact.

by spongebobstoes 1 hour ago

that makes it a good data point, because it is better able to illustrate the incremental capabilities of Mythos compared to previous tooling

that helps us to understand how much of Mythos is hype and how much is real

by 20k 1 hour ago

We see this exact hype train every time a new model is released. Mythos simply hasn't lived up to the "we're all gunna die from the flood of vulnerabilities" hype even slightly. It's slightly better than previous models by all accounts, cool stuff.

I've seen this exact chain of events, nearly word for word, multiple times before.