I'd say it is about 90% accurate for us. Often even the "Low" findings lead us to dig and realize it is actually exploitable. Everyone makes these mistakes, from the most junior to the most senior. They are just a class of bugs after all.
I expect tools like this to be a regular part of the development lifecycle from here on. We code with AI, we review with AI, we search for vulns with AI. Even if it isn't perfect, it is easily worth the cost IMHO. Highly recommend you get something enabled for your own repos ASAP.
So, how is that supposed to work? Claude Code generates security bugs, then Claude Security finds them, then Claude Code generates a fix, spend tokens, profit?
Developers create software, which has bugs. Users (including bad guys, pen testers, QA folks, automated scans etc, etc, etc) find bugs, including security bugs. Developers fix bugs and maybe make more. It's an OODA loop, and continues until the developers decide to stop supporting the software.
Whether that fits into the business model, or the value proposition of spending tokens instead of engineer hours or user hours is fundamentally a risk management decision and whether or not the developer (whether OSS contributor, employee, business owner, etc) wants to invest their resources into maintaining the project.
While not evenly distributed, and not perfect, the currently available tools and the ones still behind embargo are absolutely impactful, and yes, they are expensive to operate right now - it may not always be the case, but the "Attacks always get better" adage applies here. The models will get cheaper to run, and if you don't want to pay for engineers or reward volunteers to do the work, then you've got to pay for tokens, or spend some other resource to get the work done.
On the other hand, in the real world, developers learn from mistakes and avoid them in the future. However, there is no such feedback loop when enterprises use an LLM under an agreement that the LLM will not use the enterprise code for training purposes.
No. Humans learn from mistakes and try to avoid them in the future, but there is a whole pile of other stuff in the bag of neurons between our ears that prevent us from avoiding repetition of errors.
I have seen extremely talented engineers write trivial-to-avoid memory corruption bugs because they were thinking about the problem they were trying to solve, and not the pitfalls they could fall into. I would argue that the vast majority of software defects in released code are written by people that know better, but the bug introduced was orthogonal to the problem they were trying to solve, or was for an edge case that was not considered in the requirements.
Unless you are writing a software component specifically to be resilient against memory corruption, preventing memory corruption issues isn't top of mind when writing code, and that is ok since humans, like the machines we build, have a limit to the amount of context/content/problem space that we can hold and evaluate at once.
Separately, you don't necessarily need to use different models to generate code vs conduct security checks, but you should be using different prompts, steering, specs, skills and agents for the two tasks because of how the model and agents interpret the instructions given.
For whatever reason, hadn't associated the inattentional blindness of bug writing with the invisible gorilla experiment and car crashes - selective attention fails. People looking right at the gorilla strolling into production while chest thumping, but not seeing it, for a focus on passing basketballs. That's quite an image. Tnx.
For users on fixed monthly pay accounts they'll be incentivised to do the exact opposite, as their income is fixed and the cost goes up for more tokens.
If the available evidence (third-party cloud pricing of open models) is correct and they make a profit on tokens but lose it on training, they will be incentivised for as many tokens as possible on pay-as-you-go API calls. If it isn't correct and they actually lose money even per token, they're also going to be incentivised to reduce output here.
Whereas with LLMs, they’re really good about providing objective metrics about the bugs they found, especially as a subsequent LLM security scan does not know whether the same LLM wrote code earlier, the opposite of human devs.
And is the idea that organizations and/or benchmarks won't keep track of vulnerability rates for code from different LLMs?
(And individual devs get paid more the more of the bugs they introduced they “find”, and they have more job security with a “maintainable” code base than a “finished” one.)
No. You will switch to a competitor that does a better job or charges less or both.
This is why monopolies are such a big problem. Because under a monopoly you are right.
Apple made a ton of money off of lightning port accessories, you see it referenced here all the time. Apple had no incentive to swap to USB-C though it would create a better product and be more uniform with the rest of the world, so they kept with it despite incredibly vocal calls to swap because there was a ton of money they were making in the accessories. And it didn’t stop until they were forced to stop by the EU.
When we are talking about products at scale, these kinds of incentive structures play out in very tangible ways. If I have an LLM product and I’m getting two pulls at the hose because you’re burning tokens making stuff and correcting it, I don’t need to do anything. People are willing to tolerate that system to a pretty high degree so long as they ultimately get what they wanted in the end - unfortunately that is a great space to make money in.
The switching cost is not high for LLMs as far as I can tell.
https://en.wikipedia.org/wiki/Great_Hanoi_Rat_Massacre
> Today, the events are often used as an example of a perverse incentive, commonly referred to as the cobra effect. The modern discoverer of this event, American historian Michael G. Vann argues that the cobra example from the British Raj cannot be proven, but that the rats in the Vietnam case can be proven, so the term should be changed to the Rat Effect.
Counterargument: just because the problem can be fixed without training, doesn’t mean training isn’t a possible solution.
Thing is, writing secure and efficient and readable and simple code is in many cases fundamentally over that limit. It's possible, but you can't afford (or rationally just don't want) to spend as much on it as is required for superhuman quality on all these aspects. Also most of the time, you don't want to operate at a limit - you probably expected that feature to take 30 seconds and less than $1 to implement. So you choose, both what the model optimizes for, and how much.
Because of that, no matter how good the model and the harness and the prompting are, $10 spent on coding is still bound to leave behind some security vulnerabilities that subsequent $10 spent on security review will find (especially with a model post-trained for that, at expense of general performance).
For one thing exploits often require completely different parts of the code to chain together. Sometimes parts of code the LLM itself isn’t writing.
And, LLMs are ALREADY trained negatively against writing buggy or exploitable code.
People in this thread are talking past and misunderstanding each other and making unrelated points.
The point of the response to the top level comment was questioning the conflict of interest in model providers creating separate revenue streams for themselves by selling a product that fixes problems their other product created, akin to OS providers selling anti-virus software back in the day.
Similarly, it should be obvious to you that a software engineer can trivially get into the mindset of writing more exploitable code by pretending the production code they're tasked with writing is hobby code or prototype code.
If profitable revenue streams with adversarial products are in place, no one should be surprised when model providers are disincentivised to improve the "garbage code quality, but hey it works!" nature of their most used code generators.
>And, LLMs are ALREADY trained negatively against writing buggy or exploitable code.
...it should also be obvious people in this forum have wildly different experiences with respect to the code quality the LLMs they use generate. I personally find it difficult to find anyone that argues that the LLMs they are using are consistently generating high-quality code across a vast codebase.
I know it doesn't work so easily as someone who uses AI for coding, but I do find repetition of basics in almost every prompt keeps the AI focused.
It leads to corruption. To paraphrase Dilbert "I'm going to code myself a car."
1. Ship bugs
2. Fix them
3. You're the hero!
https://english.stackexchange.com/questions/488178/what-does...
Particularly from those outside the domain who criticised it as a 'not a very good joke' because they didn't understand it, which I think summarises the entitled mindset of many people these days.
> But in 30 days we could put in electronic relays. Get the men out of the loop.
> Gentlemen...
> I wouldn't trust this overgrown pile of microchips further than I could throw it. I don't know if
> you wanna trust the safety of our country to some... silicon diode...
The larger pattern is not unique to writing code. Think of it next time a reorg comes, or some random thing gets "improved" in the name of "efficiency" only management seems to see.
Unless they are not human.
LLMs are currently relegated to individual for-profit companies. They collect that money. There's no other choice to use them and to provide them that money.
On a broader scale, the sheer face-eating-leopards-ness of programmers finally automating away our own jobs and then realising how much this sucks, after automating away so many other kinds of jobs, can feel darkly amusing to me too.
Any computational task done by a computer could in principle be done by a person, albeit billions of times slower and with a larger error rate. If computer programs could not automate certain practical tasks -- that is to say, do them much more reliably and efficiently than people do them -- they would be an academic curiosity studied by a handful of professors instead of a central part of modern infrastructure.
So I'm sceptical of your claim not to have eliminated a single job. You might not have removed an existing job, but couldn't people be paid to do the work your code does?
You can optionally layer LLM diff scanning if you want to burn some tokens on your tokens. Modern tools can catch some impressively subtle issues.
Yeah. Presumably as AI code generation gets better, the output gets better. As smaller portions of code are stitched together, human/AI systems analyze it holistically to make sure all its integrations are secure and bug free.
In 2026, different models are better at different things. Cheap models can plan and do small/medium code projects well, more expensive models are even better at architecture and exploit discovery.
How do you avoid this pitfall?
def run():
    with contextlib.suppress(SystemExit):
        do_thread_thing()
threading.Thread(target=run, daemon=True).start()
Suppressing SystemExit was surprising, and made me curious. I followed up and asked the model: what's the purpose of that?
The model's response: "Honestly? Cargo-culting on my part. You should remove it."
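For anyone wondering why the suppress looks redundant: as far as I know, the default threading.excepthook silently ignores SystemExit, so a sys.exit() inside a thread just ends that thread. A minimal sketch of that behaviour, where do_thread_thing is a stand-in I made up (the real function isn't shown above):

```python
import sys
import threading

events = []

def do_thread_thing():
    # Hypothetical stand-in for the real work; it bails out via sys.exit().
    events.append("started")
    sys.exit(1)
    events.append("unreachable")  # never runs

def run():
    # Deliberately no contextlib.suppress(SystemExit) here.
    do_thread_thing()

worker = threading.Thread(target=run, daemon=True)
worker.start()
worker.join()

# SystemExit only ended the worker thread; the main thread carries on.
events.append("main still running")
print(events)
```

So the wrapper mostly adds noise here, which fits the model's own "cargo-culting" verdict.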
[1]: http://redsymbol.net/articles/unofficial-bash-strict-mode/
[2]: https://mywiki.wooledge.org/BashPitfalls#set_-euo_pipefail
`|| true` is a horrible practice because even though it may help in cases where a specific failure mode is acceptable, it obscures unexpected failures and could prove catastrophic. The solution is not to drop the protections but rather to handle the expected failure and let the script crash otherwise.
This is, again, programming. You don't usually `catch Exception` in Python for similar reasons. There may be legitimate uses for that, but IME they are a rare exception and realistically only used when I actually don't care about what happens when I run it.
The other infuriating thing I found is that when I call out the model for its use of `|| true`, it tends to replace them with `|| echo "error foobar"` - which is at least not completely silent but the same problems exist.
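To make "handle the expected failure and let it crash otherwise" concrete, here is a rough Python analogue of what I'd want instead of `|| true`. It's a sketch only: run_tolerating and fake_tool are names I made up, and the "tool" is just a Python one-liner that exits with a chosen status.

```python
import subprocess
import sys

def run_tolerating(cmd, expected_codes=(0,)):
    """Run cmd and return its exit code, raising on any unexpected code."""
    result = subprocess.run(cmd)
    if result.returncode not in expected_codes:
        raise subprocess.CalledProcessError(result.returncode, cmd)
    return result.returncode

def fake_tool(code):
    # Hypothetical command that simply exits with the given status.
    return [sys.executable, "-c", f"import sys; sys.exit({code})"]

# Exit status 1 is the one failure we decided is acceptable here.
status = run_tolerating(fake_tool(1), expected_codes=(0, 1))

# Anything else still fails loudly instead of being swallowed.
try:
    run_tolerating(fake_tool(2), expected_codes=(0, 1))
    unexpected_raised = False
except subprocess.CalledProcessError:
    unexpected_raised = True

print(status, unexpected_raised)
```

The point is that the tolerated failure is named explicitly, so a new failure mode shows up as a crash rather than as silence.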
As I was educating myself, I found Richard Feynman's Commencement Speech at Caltech in '74 [2] that might have coined this for our industry? If you would rather listen than read [3]. Posting this for others curious on the term.
1. https://trends.google.com/trends/explore?q=Cargo-culting&hl=...
Seems you would not need that many tokens to do so and you might find such cases.
The high impact findings have almost all been bang on for me. I was especially surprised by the high-quality documentation it produces as well as how narrow the proposed fixes are.
I’m used to codex producing quite a bit more code than it needs to, but the security model proposed fixes that are frequently <10 loc, targeting exactly the correct place.
It’s really quite good. I’m assuming it’ll be pretty expensive once out of beta, but as a business I’d be jumping on this.
every tom, dick and harry who can type english has the tools to attack any software that isn't patched.
tools that were accessible to specialized groups, now made available to anybody with a grudge and a few dollars for tokens.
and what does anthropic and openai do? They form an inner ring to make the latest models available first to Enterprises. Enterprises will cough up the prices that anthropic and openai set, they have no choice here.
Eventually everybody pays. This does not sound good
I'm not even sure a specialized model is needed here. It probably just needs the right harness around existing ones.
I expect the next two years to be absolutely brutal for hacks. Attackers have supercharged tools in their hands right now. Defenders are only getting started and will have to plow through a massive backlog of newly uncovered vulns.
The major short term downside is that open source or personal projects won't be able to afford things like Codex Security.
Realistically, all open-source projects should be forced to have automated scans of this nature before their releases can be shipped. This is something the package managers and github need to figure out. It'd stop the supply chain attacks too.
Then open source projects need a McKinsey-like stamp of approval to even be released.
Sounds like there are many parasites in this process.
You know that open source users are free to scan everything if they want to?
Yeah it’s hard to write a loop that makes an adversary agent write and mask malware then runs a scanning agent and if the malware is detected gives the detection details to the adversary agent with instructions to hide it better..
As usual, the attacker only needs to get lucky once.
That's a great way to kill OSS. This is only bootlicking the idea of corporations profiting off of unpaid labor.
This is what I did: I use a loop skill to dig for problems and bugs at each step of development, from design to coding, to make sure the resulting software works properly and as intended.
I don’t think you need all of that though. I know a whole mess of people that have gotten it for much less. Should just give it a try.
It’s disappointing that Anthropic and OpenAI never responded to the applications to their respective programs for open source maintainers. From my perspective it seems like their offers are primarily for the shiny well-known projects, rather than ones that get only a few million monthly installs but aren’t able to get thousands of stars due to being “hidden” as a dependency of a popular tool.
Dude is flexing that he's pushing unsecure code every day, that's a skill!
“I see no evidence that this setup [Mythos] finds issues to any particular higher or more advanced degree than the other tools have done before Mythos. Maybe this model is a little bit better, but even if it is, it is not better to a degree that seems to make a significant dent in code analyzing.”
https://daniel.haxx.se/blog/2026/05/11/mythos-finds-a-curl-v...
In any event, it barely matters. As Anthropic acknowledges, next level models are coming, theirs is only one of them. Current generation models are already good at things like tracing data flow through complex systems and there’s no reason to think that capability has topped out. So within a year it seems very likely we’ll have more than one commercially available model able to find vulnerabilities cheaply.
On the other hand, it seems that they’ve made much less progress on getting it to design solutions to these issues.
Meanwhile from [1]:
"Not even half-way through this #curl release cycle we are already at 11 confirmed vulnerabilities - and there are three left in the queue to assess and new reports keep arriving at a pace of more than one/day."
"The simple reason is: the (AI powered) tools are this good now. And people use these tools against curl source code. They find lots of new problems no one detected before. And none of these new ones used Mythos. Focusing on Mythos is a distraction - there are plenty of good models, and people who can figure out how to get those models and tools to find things."
Yeah, it looks like there are at least 11 security bugs missed by Mythos.
[1] https://www.linkedin.com/feed/update/urn:li:activity:7463481...
That would align with the curl feedback you linked, they aren't using mythos but are finding bugs with other models. Presumably the expectation would be that with mythos they'd find more that were missed by other models already used.
It's not quite apples-to-apples. It was Opus on Firefox 148, Mythos on 150. A better test of Mythos vs Opus would have been to apply Mythos to Firefox 148. Or also re-apply Opus to Firefox 150.
Do we know all the Opus+Firefox 148 bugs are fixed in Firefox 150? Do we know the number of new bugs introduced per Firefox release?
That may be parsable from their bug tracker, though I don't know if all bugs raised by mythos are public.
I'd be particularly interested in how many of the bugs found existed in 148. Assuming most or all of them weren't newly created bugs added in 149 or 150, the comparison should still hold even though Opus and Mythos looked at different releases.
Anthropic promised us that Mythos was such an existential threat that it would compromise "every OS and browser on devices across the planet". They've held conferences and meetings with banks and govts across the world, shouting how critical this issue is.
GPT5.5 has been out for a month. Every device on earth has not been breached yet. It's very fair to criticize Anthropic's maximalist posturing when it's becoming exceedingly clear their models are fairly behind OpenAI's in capability.
In my opinion, the original commenter's statement stands, and the UK govt data point only helps support that due to the equal result between Mythos and GPT.
I'd advise reading into the specifics of what happened with Firefox; the TL;DR is a reduced safety version of its code was scanned by Opus 4.6 (yes Opus) and found a multitude of bugs and 4 high severity vulns that did not escape sandbox. The Mythos system card test describes running Mythos against the same issues Opus found to see if it could reliably replicate and chain together an attack.
He posted a general update today on LinkedIn which I think gives the wider context:
https://www.linkedin.com/feed/update/urn:li:activity:7463481...
> Not even half-way through this #curl release cycle we are already at 11 confirmed vulnerabilities - and there are three left in the queue to assess and new reports keep arriving at a pace of more than one/day.
> 11 CVEs announced in a single release is our record from 2016 after the first-ever security audit (by Cure 53).
> This is the most intense period in #curl that I can remember ever been through.
They don't focus on projects where they find nothing. They certainly don't advertise when they find nothing.
Getting a lot of scrutiny is not the recommendation that it appears to be. What is the new standard? Projects that never have bugs are deemed to be suspect because they "have not been scrutinized" (they have, but null results never go public)?
So Mythos only finding one issue after other tools have found 300 this year is embarrassing. Mythos was supposed to be better and novel.
No, it didn't attract a bluepill exploit research.
The fact that 300 bugs found in a year is not a recommendation as the pro-AI mafia suddenly claims ("because it has been analyzed!") still stands. Maybe the AI-mafia should sell "analyzed by Mythos" labels to impress people who don't write public software or find bugs for that matter.
Btw, he's a security researcher. You should be more respectful.
The Linux kernel is the right reference target, if you need one.
Curl is a high bar for a different reason (the same one as sudo): it doesn't do enough to be all that interesting. Stenberg is having trouble keeping up with all the inbounds, but look at the 2026 CVEs: they all seem kind of boring? Exploit developers aren't hunting for "wrong reuse of HTTP Negotiate connection". Like, yes, these are legitimate bugs, important that they get fixed, but none of them are prizes.
By rights, OpenSSH should be a smoking crater. It's not, I believe because of sheer engineering excellence.
Yes, moving the goalposts, holding it wrong, yes that's what I believe
Why not? TFA says 23 000 findings "of all severities" and then, in the end, only 88 security advisories published.
What we'd really need is how many security advisories not related to Mythos findings have been published in the same time. If it's, say, 500 security advisories (just making a number up), wouldn't Anthropic's update in TFA and Daniel Stenberg's comments reconcile?
Like, yup, we've got a new tool to find exploits. It's a tool. It's new. We already had tools. Let's make the software world a bit more secure.
Now if you tell me that 100 security advisories have been published in that timespan and that 88 were due to Anthropic's Mythos: now I'd have to say that it's hard to reconcile Daniel Stenberg's position with TFA.
> 1,752 of those high- or critical-rated vulnerabilities have now been carefully assessed by one of six independent security research firms, or in a small number of cases by ourselves. Of these, 90.6% (1,587) have proved to be valid true positives, and 62.4% (1,094) were confirmed as either high- or critical-severity.
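A quick sanity check on the quoted numbers, assuming both percentages use the 1,752 assessed findings as the denominator (the quote doesn't spell that out, but it is the only reading where the figures line up):

```python
assessed = 1752                    # high- or critical-rated findings reviewed
true_positives = 1587              # judged valid
confirmed_high_or_critical = 1094  # severity confirmed after review

tp_rate = round(100 * true_positives / assessed, 1)
high_rate = round(100 * confirmed_high_or_critical / assessed, 1)

print(tp_rate, high_rate)  # 90.6 62.4
```

Note that this rate covers only the subset of findings that were already rated high or critical and then reviewed, not every finding the model produced.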
for anybody who has applied opus, codex or oss models for vuln scanning - the true positive rate and discovery volume are a clear step change[0]. The ~50 partners in Glasswing have largely all previously run harnesses with other models and many of them have come out and said - essentially - "ye, wow"
Question now is what second and third phases of access look like - deciding which class of systems to secure. Routers, firewalls, SaaS, ERP systems, factory controllers, SCADA systems, zero-trust VPN gateways, telecoms gear and networks, medical devices - there's just so much to do
This is why I believe mythos will remain private for the foreseeable future. There's such a large surface that needs to be secured and so much to triage, fix, deploy.
That may suit Anthropic as private models can't be distilled. There's also a runaway effect of model improvement from the discovery, triage and fix data. This is likely already the most potent corpus of curated offensive data ever assembled and will only get better.
I don't see how Chinese companies are given access soon, or ever. We're likely going to see a world soon of CISA mandated audits, and where to buy a mythos-proof VPN gateway or home router - you'll have to buy American[1].
[0] vs ~30% or so in regular audit tools
[1] or allied
But that corpus of data is accessible to all competitors, American or not. I don't believe that this can't be replicated. I'd posit that there's enough annotated data out there (CVE+patch), only increasing thanks to Mythos, that if you specifically RL for this scenario, you can improve your model's performance on finding vulnerabilities without access to Mythos.
Mythos is a better hacker than we ever were
sigh I remember the GPT-2 days - when it was the first time OpenAI restricted access to the models citing "humanity is not ready for it". The model was good at writing poetry or something.
Since then, I don't remember a single model announcement from OAI/ANT that didn't use similar wording.
The so-called leak of the model announcement was marketing, it being dangerous is marketing, the world not being ready for it is marketing. And yes, the ones that were given access saying "oh wow", believe it or not, is also marketing.
It's all marketing. You can get the same results from any of the top-5/10 models that are generally available already.
Mythos is Anthropic's way to sell the new idea, because the previous one has democratized.
Marketing is like propaganda. It doesn't need to be based on false facts. Of course they're gonna milk it, keep it private and so on. But that doesn't mean the model is bad. Or that others are as good (apparently they're not there yet).
[1] - https://www.aisi.gov.uk/blog/our-evaluation-of-openais-gpt-5...
If that doesn't convince you that both mythos and 5.5 are a step up (several steps, hah) nothing will.
If I was given free access to any frontier model to use on my projects, equivalent of millions of dollars in AI credits, I sure hope people didn't trust anything that came out of my mouth until they were able to verify my claims themselves.
AI industry has even resulted in a new term - benchmaxing - which essentially means we can't even trust the data anymore until we can touch the model ourselves. So this is not at all surprising to me. What's surprising is why am I in the minority here, and since when trusting authorities that have obvious conflicts of interest became normal.
This just seems overly conspiratorial to me. I don't remember Anthropic ever lying in their blog posts. They've been about as consistent as Apple when it comes to product claims.
They can be distilled internally… expect great things from Sonnet 4.8
Not to say these things won't catch vulnerabilities static tools cannot, I think they can, it's just we already have the capability to automatically catch a large surface area of common vulns, and have chosen not to, often for expense reasons.
If you're a team that does already apply several layers of analysis and linting, and wants to add this on top, all power to you.
Because most issues are in business logic that static analyzers aren't going to catch.
I'm at a FAANG and even our static analysis tools are not great at identifying how many issues are actually reachable.
Ideally you use both. An AI model that has static analysis as part of the harness, so it can evaluate each potential finding.
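Roughly what I mean by static analysis as part of the harness: the analyzer proposes findings, and a second step evaluates each one. This is only a sketch. Finding, triage and fake_model_verdict are made-up names, and the "model" is a placeholder function, not a real API call.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

@dataclass(frozen=True)
class Finding:
    rule: str
    path: str
    line: int

def triage(findings: Iterable[Finding],
           is_reachable: Callable[[Finding], bool]) -> list[Finding]:
    """Keep only the static-analysis findings the evaluator confirms."""
    return [f for f in findings if is_reachable(f)]

# Hypothetical analyzer output; a real harness would parse a tool's report.
static_findings = [
    Finding("sql-injection", "api/users.py", 42),
    Finding("sql-injection", "tests/fixtures.py", 7),
]

def fake_model_verdict(finding: Finding) -> bool:
    # Placeholder for a model call that checks reachability;
    # here test-only code is simply treated as unreachable.
    return not finding.path.startswith("tests/")

confirmed = triage(static_findings, fake_model_verdict)
print(confirmed)
```

The static tool keeps the candidate list cheap and repeatable, and the expensive evaluation only runs once per candidate.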
Ideally the static analysis tools are improved so that we don't need to piss away yet more tokens like we're competing on Mark's leaderboard just to find vulnerabilities.
Your proposal of relying purely on static analysis is over-idealistic and just not feasible for large, diverse codebases in the real world.
That's where AI comes in.
"Just not feasible" is thought terminating, but regardless, I thought we were talking about ideals? Ideally you want the static analysis to work, not to rely on the non-deterministic bullshitter.
> non-deterministic bullshitter.
You're so ideologically opposed to AI that you bury your head in the sand in cases where it genuinely does a fantastic job today, right now, in the real world (like developing end to end exploits using noisy signals like static analysis results, fuzzer results, etc).
Instead you assert that we should go a route no company has successfully proven out despite throwing billions of dollars and some of the best cybersecurity talent in the world at.
Anyways, if you develop a static analysis solution that works across large, diverse production codebases and develops end to end working exploits without AI, I will literally buy it off you for millions of dollars. Or you could start your own company. You'd be an overnight decabillionaire.
You claim static analysis does the job, but you haven't backed it up with any proof that it works across large diverse codebases. Meanwhile, we have proof that AI works at least somewhat, here and now.
Most people doing this now didn't use static analysis tools because they were seen as an unnecessary extra.
> Network defenders should shorten their patch testing and deployment timelines.
Shortening patch cycles will only help so much. It's funny that whenever an NPM supply chain attack is published, people recommend a cooldown before installing new versions, and then when a vulnerability is discovered, everybody jumps to patch. Clearly these two strategies collide at some point.
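Here is the collision written down as a toy policy. The 7-day cooldown and the fixes_known_vuln flag are assumptions of mine, not anything a package manager actually exposes under those names.

```python
from datetime import date, timedelta

COOLDOWN = timedelta(days=7)  # assumed value, pick your own

def should_install(published: date, today: date, fixes_known_vuln: bool) -> bool:
    """Install once the cooldown has passed, or immediately for a security fix."""
    if fixes_known_vuln:
        return True  # this branch is exactly where the two strategies collide
    return today - published >= COOLDOWN

today = date(2026, 5, 20)  # made-up dates for illustration
fresh = date(2026, 5, 19)
old = date(2026, 5, 1)

print(should_install(old, today, False))    # True: cooldown has passed
print(should_install(fresh, today, False))  # False: still cooling down
print(should_install(fresh, today, True))   # True: security fix skips the wait
```

The weak spot is whoever sets that flag: a compromised release that claims to fix a vulnerability skips the cooldown entirely.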
> The critical controls laid out by organizations like the National Institute of Standards and Technology and the UK’s National Cyber Security Centre are now all the more important, since they improve security without depending on any single patch landing in time. These include steps like hardening networks’ default configurations, enforcing multi-factor authentication, and keeping comprehensive logs for detection and response.
Most of these proposed controls are not new at all, but they are often costly to implement and harm velocity in other ways, which is why they aren't widely in place.
For example, a super effective control is filtering outgoing network traffic. Many exploits rely on loading second and maybe third stages from the Internet, and if you block outgoing requests by default, it won't work.
But, blocking outgoing requests by default is super hard, and you risk blocking security updates etc. It can kinda work for a deployed application, but for an employee workstation? Basically impossible.
I wonder if we're approaching an era where we have to go back to saying "you cannot do this, because security" much more often than we'd like.
It’s a good point. As things speed up it will be harder to tell which patches are actually urgent and need to skip the cool-off period.
I think the more robust way of doing this is to have code audits on each published release. Agents can do some of this (eg Github could offer this scanning service, and let external parties fund the scanning on trusted compute).
I think of this more as a “proof of work” problem than provable security; if I see that Mythos has run for N hours on the patch release I am considering upgrading to, then this might suffice.
The key thing here is you need a way to crowdsource the funding of scans, and make them shareable so that the cost can be shared across the community. The package owner obviously can’t control the prompt. And can Mythos-class models be hardened enough to scan hostile code?
To your point on blocking requests, there are programming models that make this easier, like capability-based programming, where code that doesn’t need internet cannot get it; this doesn’t solve things fully, but my general prediction is that adding new architectural patterns is now a lot cheaper and easier to reliably apply across a codebase, so we may see more of this too.
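A tiny illustration of the capability idea, with the caveat that plain Python does not enforce it (any module can still import socket), so this is the pattern by convention only. Fetch, render_report, check_for_updates and fake_fetch are all names I made up for the sketch.

```python
from typing import Callable

# A "capability" here is just a function handed in explicitly by the caller.
Fetch = Callable[[str], str]

def render_report(rows: list[str]) -> str:
    # Receives no Fetch capability, so by convention it cannot reach the network.
    return "\n".join(rows)

def check_for_updates(fetch: Fetch, url: str) -> str:
    # Only code that is explicitly given `fetch` can make outbound requests.
    return fetch(url)

def fake_fetch(url: str) -> str:
    # Stand-in for a real HTTP client; it never touches the network.
    return f"fetched {url}"

report = render_report(["a", "b"])
update = check_for_updates(fake_fetch, "https://example.invalid/latest")
print(report)
print(update)
```

In a language or runtime that actually enforces this, auditing network access comes down to finding who was handed the fetch capability.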
"Vulnerabilities in the software that makes the internet" is honestly lower priority than "The platform that the software that makes the internet uses to make releases". If buyers of those internal repos find ways to break into GitHub such that they can cut software releases, or poison github actions from a distance, then we're all in a very ugly mess.
Don't forget that in those 3800 repos is likely also npmjs.org itself.
Security vulnerabilities are one thing, but in legal we offer up a concept of "knowledge security" which goes to protecting the fidelity of the agent's legal context. Software bugs seem much more tractable because they're managed by software engineers, as opposed to the pipeline "vulnerabilities" we're finding. We wrote a little about one vector here where legal documents aren't quite what they seem: https://tritium.legal/blog/noroboto
No doubt there are many such knowledge domains exposed today. These are more concerning because they're understaffed and managed by non-technical people for the most part. No Mythos required.
That means, they intend to make a load of money before a general release. It is a good strategy.
This has always been the bottleneck. Automated tools love to flag vulnerabilities, but almost all are false positives. These need to be triaged and evaluated by humans. This is okay. I’d rather close a false positive after a careful review than miss a real vulnerability altogether.
I don’t think it’s appropriate to call out humans as a bottleneck. They are an essential part of the process, and I’m sure Mythos will also become a catalyst in it.
"After one month, most partners have each found hundreds of critical- or high-severity vulnerabilities in their software. Collectively, they’ve found more than ten thousand. Several have told us that their rate of bug-finding has increased by more than a factor of ten. For instance, Cloudflare has found 2,000 bugs (400 of which are high- or critical-severity) across their critical-path systems, with a false positive rate that Cloudflare’s team considers better than human testers." (emphasis mine)
I am still a believer that 100 subagents with good-enough intelligence can get the same results as Mythos. I am ready for this opinion to be shattered when I eventually try Mythos, and I believe others here must have tried it out too.
So, success is coming not just from the model but also from the harnesses they built around it. The Cloudflare post was more detailed on that front and I wish the rest would share more about it.
The Cisco spec is interesting too, it pretty much describes an architecture of a harness: https://github.com/CiscoDevNet/foundry-security-spec
And how much with Opus 4.7? 5x?
https://www.flyingpenguin.com/mythos-mystery-in-mozilla-numb...
There is also a pretty big risk that anyone who is not you would leak the answer to the test. We are close to n=1 epistemics here. You’re going to have to do the research yourself.
Yes, Anthropic have said they made Opus 4.7 worse at this on purpose.
> It is entirely possible that Mythos is a different architecture or size
It has 5x the token pricing of Opus 4.7, so it's probably larger.
https://www.aisi.gov.uk/blog/our-evaluation-of-openais-gpt-5...
If I understand correctly, Opus 4.7 was launched as nerfed Mythos with some improvements from 4.6.
Anthropic launches major bumps (like 4.6 to 4.7) every 4 - 5 months. So by all accounts, Mythos should be released by July.
The problem reduces to: How quickly can competing models surpass Opus 4.7 and start taking over Anthropic's market share?
So yeah, huge marketing as always.
That's the one that says:
> We took the specific vulnerabilities Anthropic showcases in their announcement, isolated the relevant code, and ran them through small, cheap, open-weights models. Those models recovered much of the same analysis.
Or providing a map with a direction
There is a long history of high-value private vulns being rediscovered from scant details
The American firms are focused on marketing now to convince people to not even consider open sourced models / open weight models as they are inferior (that’s what they want you to believe).
If people actually believe the narrative then the bankers will overprice Anthropic and get away with it.
4.6 but close.
"...After fixing the initial set of issues that Anthropic sent to us in February, we built our own harness atop our existing fuzzing infrastructure.
We began with small-scale experiments prompting the harness to look for sandbox escapes with Claude Opus 4.6. Even with this model, we identified an impressive amount of previously-unknown vulnerabilities which required complex reasoning over multiprocess browser engine code..."
So yeah, Anthropic and Mozilla are likely comparing "number of bugs found by Opus 4.6 during early experiments" vs "number of bugs found by Mythos during large-scale codebase scanning".
[1] https://hacks.mozilla.org/2026/05/behind-the-scenes-hardenin...
https://xbow.com/blog/mythos-offensive-security-xbow-evaluat...
Great marketing as always, but the rose-tinted view many have seems vicariously misplaced.
These aren't unreachable vulns.
That's convenient.
But wait, don't they have this amazing AI that can fix all the issues itself with a single /goal command? What's the holdup?
I miss the days when HN would RTFA.
> As we noted above, the bottleneck in fixing bugs like these is the human capacity to triage, report, and design and deploy patches for them.
...
> To begin, we’ve released Claude Security in public beta for Claude Enterprise customers. It’s a tool that helps teams scan their codebases for vulnerabilities, and which can generate proposed fixes for them. In the three weeks since launch, Claude Opus 4.7 has been used to patch over 2,100 vulnerabilities. (This is faster than the open-source patching described above in large part because enterprises are fixing their own code, whereas open-source fixes usually require volunteer maintainers who work through coordinated disclosure.)
Your critique of the article would likely land much better if you engaged with it.
> However, this means that disclosed vulnerabilities are a lagging indicator of the accelerating frontier of AI models’ cyber capabilities: we’re not yet at the point where we can fully detail our partners’ findings with Mythos Preview without putting end users at risk. Instead, we provide illustrative examples of the model’s performance, along with aggregate statistics on our progress to date. Once patches for the vulnerabilities that Mythos Preview has discovered are widely deployed, we’ll provide much more detail about what we’ve learned.
Do we have a sense that projects like OpenBSD/OpenSSH, FreeBSD, ISC[1] and Apache were included in the "blessed" initial participants in Project Glasswing?
Or is it big name tech companies, banks and fashionable languages and package managers?
[1] Bind, DHCP
It's not clear to me that FreeBSD found any of them internally ...
It's probably a better approach to onboard a few independent security companies and task them with reviewing multiple OSS projects than to onboard each project individually.
I joke but that is the world we are moving towards. I don’t think many on HN have thought through the second and third order implications.
Here are two experimental exceptions:
We'll likely have some standard AI-focused UI libraries that are harnessed into a design gen system where an AI can pull all the real levers, while also developing a large training data set around it.
Cities should all have better public transport and out in the middle of nowhere you don't need self driving anyway. (And yes, personal cars should be entirely banned from cities)
It could become that way, but thus far no evidence has been presented for it. The best we have right now is that you can spend $20 in tokens to write a patch and then $20K to find a vulnerability in it. First, that's not measuring the same thing. Second, it's not very impressive.
50 years is a long, long time, so I wouldn't bet against it. But I agree that we don't have evidence for it yet.
It seems more likely to me that you could spend $20 to find a vulnerability in a piece of software that cost you $20k in human labor.
There is a difference between a stunt and a viable product. Driverless cars and AGI are the fusion of Silicon Valley.
This is the MoviePass era of language models
Supersonic, again, is a problem of noise and cost rather than technology.
Self driving is definitely a technological problem.
I wonder how long "near future" is in Anthropic time. I think they have incentives to delay the release of Mythos as long as possible both to save compute and delay distillation by rival labs.
Regardless, what they have been doing with Glasswing is very cool. It's clear that the world has been spared from a massive security nightmare that would have happened in any alternative timeline where the model is publicly released with weak safeguards.
As I see it the primary issue is giving time for the ecosystem to adapt. Once models of a given level of capability have been applied to the majority of the common software in daily use it becomes reasonably safe to release such models publicly regardless of how they are used.
> For example, at one of our Glasswing partner banks, Mythos Preview helped to detect and prevent a fraudulent $1.5 million wire transfer after a threat actor compromised a customer’s email account and made spoof phone calls.
For some reason I am not able to relate to the concreteness of either of these.
The first half of the page was occupied by an image; not sure if it was relevant in any way other than setting up a security scare. The size of the code base, number of tokens, and $ involved seem to be out of scope of the update for some reason. Personally I am getting skeptical about all these optics at this point; at a high level it looks like just some money-printing scheme.
But I didn't find the most important information (or maybe I missed it): how much did it cost to find 1451 security bugs?
Claude Mythos Preview will be available to participants at $25/$125 per million input/output tokens
...
Anthropic is committing up to $100M in usage credits for Mythos Preview
Although I'd expect reduced prices for cached tokens, which are not mentioned on their website at this point in time.
And so was malicious vulnerability research.
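For a rough sense of scale, the quoted $25/$125 per million input/output tokens can be turned into a small cost estimate. The token counts for the example scan below are made up purely for illustration, and cached-token discounts are not modeled since none are published.

```python
# Prices quoted above, in USD per million tokens.
INPUT_PRICE_PER_M = 25.0
OUTPUT_PRICE_PER_M = 125.0


def scan_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one run at list price, ignoring any cached-token discount."""
    return (
        (input_tokens / 1_000_000) * INPUT_PRICE_PER_M
        + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M
    )


# Hypothetical scan: 2M tokens read, 0.5M tokens generated (assumed numbers).
example = scan_cost_usd(2_000_000, 500_000)
print(f"${example:,.2f}")  # $112.50

# Upper bound on output tokens the $100M credit pool could buy
# if it were spent on output tokens alone.
credit_pool_usd = 100_000_000
max_output_tokens = credit_pool_usd / OUTPUT_PRICE_PER_M * 1_000_000
print(f"{max_output_tokens:.1e} output tokens")
```

This says nothing about what the 1451 bugs actually cost, because the real token counts per finding are not disclosed.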
Plus, they also mention they check if fixes are available for the bugs they found. What are the chances they are re-reporting old bugs just to inflate their numbers? Bugs that were already fixed?
And how can we be sure their reassessment is not artificially increasing the severity of the CVEs found just to create FUD and sell their product?
> that's just thousands of vulnerabilities being discovered by our trillion parameter model
> thousands of vulnerabilities and trillions of parameters?! At current energy prices, in this economic climate, isolated entirely within your datacenter?
> yes
> may we see it?
> no
>ya right.
Here's a demonstration of it blowing something up.
>can I have one.
No.
And at the moment we have reports from around 5(?) companies. Btw, Palo Alto Networks has found only 26 vulnerabilities [1]. I'm interested in who those partners are and why they have such a large number of vulnerabilities.
> For instance, Cloudflare has found 2,000 bugs (400 of which are high- or critical-severity) across their critical-path systems, with a false positive rate that Cloudflare’s team considers better than human testers.
Yet decided not to share that number. I wonder why.
> Mozilla found and fixed 271 vulnerabilities in Firefox 150 while testing Mythos Preview—over ten times more than they found in Firefox 148 with Claude Opus 4.6;
Mozilla tested Opus 4.6 in a very limited setting (i.e. without a proper harness and integration into their workflow; likely without large-scale codebase scanning). It's an incorrect comparison.
> The latest Palo Alto Networks release included over five times as many patches as usual.
Yeah, it's better to say "five times as many..." rather than "26 bugs". Btw, they also used GPT-5.5 and Opus 4.7, so the contribution from Mythos there is unclear.
> Microsoft has reported that the number of new patches they’ll release will “continue trending larger for some time.” And Oracle is finding and fixing vulnerabilities across its products and cloud multiple times faster than before.
Both Oracle and Microsoft are talking about "AI and cybersecurity" in general, not about Mythos.
> For the last few months, Anthropic has used Mythos Preview to scan more than 1,000 open-source projects, which collectively underpin much of the internet—and much of our own infrastructure. > So far, Mythos Preview has found what it estimates are 6,202 high- or critical-severity vulnerabilities in these projects (out of 23,019 in total, including those it estimates as medium- or low-severity).
So, ~6 high- and critical-severity bugs per open-source project vs. hundreds of high- and critical-severity bugs per partner project. It looks like the math ain't mathing.
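The per-project figure follows directly from the quoted numbers. A quick check, with the caveat that the quote says "more than 1,000" projects, so these averages are upper bounds:

```python
projects = 1_000            # "more than 1,000", so the averages below are upper bounds
high_or_critical = 6_202    # Mythos's own severity estimate, pre-triage
total_findings = 23_019

per_project_high = high_or_critical / projects
per_project_total = total_findings / projects
share_high = high_or_critical / total_findings

print(f"high/critical per project: {per_project_high:.1f}")   # 6.2
print(f"all findings per project:  {per_project_total:.1f}")  # 23.0
print(f"share rated high/critical: {share_high:.1%}")         # 26.9%
```

An average hides the distribution: a handful of large projects could account for most of the findings, which the quoted totals do not break down.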
> One example of an open-source vulnerability that Mythos Preview detected was in wolfSSL, an open-source cryptography library that’s known for its security and is used by billions of devices worldwide. Mythos Preview constructed an exploit that would let an attacker forge certificates that would (for instance) allow them to host a fake website for a bank or email provider. The website would look perfectly legitimate to an end user, despite being controlled by the attacker. We’ll release our full technical analysis of this now-patched vulnerability (assigned CVE-2026-5194) in the coming weeks.
Of course, they didn't say that Mythos found only 8 bugs in wolfSSL vs 22 CVEs fixed in wolfSSL 5.9.1.
Overall, it feels like yet another marketing stunt.
[1] https://www.paloaltonetworks.com/blog/2026/05/defenders-guid...
Which is not bad this early in the 90+45 day responsible disclosure window.
> Yet decided not to share that number. I wonder why.
It is bizarre to expect a company to disclose the false-positive rate of their security engineers, publicly. That does not happen.
> So, ~6 high- and critical-severity bugs per open-source project vs. hundreds of high- and critical-severity bugs per partner project. It looks like the math ain't mathing.
It is pretty obvious they're spending more compute on commercial partners. Why is this surprising?
> Of course, they didn't say that Mythos found only 8 bugs in wolfSSL vs 22 CVEs fixed in wolfSSL 5.9.1.
WolfSSL is not the only software project in the world. Mozilla also came out with results that paint it as very effective. I don't think Mythos ever claimed to find all bugs anyways.
Drawback of AI: it works fast
Is this suspected vulns or actual vulns? If I recall correctly, it produced 5 for curl but only 1 was legit
> 1,752 of those high- or critical-rated vulnerabilities have now been carefully assessed by one of six independent security research firms, or in a small number of cases by ourselves. Of these, 90.6% (1,587) have proved to be valid true positives, and 62.4% (1,094) were confirmed as either high- or critical-severity. That means that even if Mythos Preview finds no further vulnerabilities, at our current post-triage true-positive rates, it’s on track to have surfaced nearly 3,900 high- or critical-severity vulnerabilities in open-source code
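The percentages and the "nearly 3,900" projection in that quote can be reproduced from the raw counts. The only assumption is that the confirmed high/critical rate among the 1,752 triaged findings carries over to all 6,202 flagged ones, which is how the quote itself frames it:

```python
flagged_high_or_critical = 6_202   # Mythos's own estimate across all scanned projects
triaged = 1_752                    # assessed by independent firms or Anthropic
true_positives = 1_587
confirmed_high_or_critical = 1_094

true_positive_rate = true_positives / triaged
confirmed_rate = confirmed_high_or_critical / triaged

# Extrapolate the confirmed rate from the triaged sample to everything flagged.
projected_confirmed = flagged_high_or_critical * confirmed_rate

print(f"true-positive rate:      {true_positive_rate:.1%}")  # 90.6%
print(f"confirmed high/critical: {confirmed_rate:.1%}")      # 62.4%
print(f"projected confirmed:     {projected_confirmed:.0f}")
```

The projection is only as good as the sample: if the triaged 1,752 were not picked at random (for example, the most convincing reports first), the real rate on the remainder could be lower.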
> Not even half-way through this #curl release cycle we are already at 11 confirmed vulnerabilities - and there are three left in the queue to assess and new reports keep arriving at a pace of more than one/day.
> 11 CVEs announced in a single release is our record from 2016 after the first-ever security audit (by Cure 53).
> This is the most intense period in #curl that I can remember ever been through.
[1]: https://www.linkedin.com/feed/update/urn:li:activity:7463481...
If you read his own top comment on that LinkedIn post he clarifies:
“The simple reason is: the (AI powered) tools are this good now. And people use these tools against curl source code. They find lots of new problems no one detected before. And none of these new ones used Mythos. Focusing on Mythos is a distraction - there are plenty of good models, and people who can figure out how to get those models and tools to find things.”
I guess they forgot to scan Visual Studio Code plugins and their endless npm dependencies.