Is not
We sent open weight models against a codebase to find vulnerabilities.
In that sense: The AISLE replication still provides too much information to the model, but it's not far off, and others have replicated Mythos' findings in a more clandestine manner on open source models. Some were totally capable of finding the same vulns Mythos found back in ~March (and today, the new Kimi K2.7 is looking extremely good; there is very little doubt it could do it).
The critical difference is the post-processing: the Mythos model/harness has some step that induces Mythos to actually exploit the vulnerability, leveraging its ability to do so as a ranking mechanism. Anthropic implied that this led Mythos to discover vulnerabilities nothing else could discover, which is not true, and Anthropic should be held accountable for this weird artifact of their communication. However:
- An OSS model might find the vulnerability but rank it as a 3/10. Mythos finds it, chains it with a second vulnerability, and now suddenly it's an 8/10.
- An OSS model might find the vulnerability, alongside fifty other vulnerabilities. The operator ignores all of them.
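To make the ranking point concrete, here is a toy sketch of what "use exploitability as a ranking signal" could look like. Everything in it is an assumption for illustration: the `Finding` fields, the `triage_score` function, and the +3/+2 bonuses are made up, and nothing here reflects how Anthropic's harness actually works.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Finding:
    name: str
    static_score: int                 # the model's initial 0-10 severity guess
    exploit_confirmed: bool = False   # hypothetical: an exploit attempt succeeded
    chained_with: List[str] = field(default_factory=list)  # other bugs it chains with


def triage_score(f: Finding) -> int:
    """Bump the initial guess when exploitation evidence exists (arbitrary weights)."""
    score = f.static_score
    if f.exploit_confirmed:
        score += 3
    if f.chained_with:
        score += 2
    return min(score, 10)


def rank(findings: List[Finding]) -> List[Finding]:
    """Highest triage score first."""
    return sorted(findings, key=triage_score, reverse=True)


# Fifty unremarkable 3/10 findings (the haystack) plus one that was
# actually exploited and chained with a second bug (the needle).
findings = [Finding(f"noise-{i}", 3) for i in range(50)]
findings.append(Finding("needle", 3, exploit_confirmed=True, chained_with=["second-bug"]))

top = rank(findings)[0]
print(top.name, triage_score(top))
```

Without the exploit step, all fifty-one findings sit at 3/10 and the operator has no reason to look at any of them; with it, one of them floats to the top.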
The problem with automated vulnerability detection, including with LLMs, is that it finds the haystack, not the needle. Every piece of hay might be a vulnerability, but whether it's worth fixing is another matter. Mythos does represent a meaningful improvement; it better finds the needle.
This was the primary reason for not releasing it. The difference between the two primary camps on this topic is that the doubter group thinks that Anthropic, and all of their partners, are essentially lying about this (since no one outside Anthropic and select partners has the access to replicate), whereas the other side believes that Anthropic and partners are probably mostly telling the truth without too much exaggeration.
Neither camp has evidence other than unconfirmable reports and/or arguments about economic incentives. I personally think that Anthropic has, in the past, mostly not lied about things like this and has by far been the most transparent and open AI company. That could change, and they could be lying now, but I think that the camp that is certain that they are is far, far too confident in their belief.