I'm pretty optimistic that not only does this clean up a lot of vulns in old code, but applying this level of scrutiny becomes a mandatory part of the vibecoding-toolchain.

The biggest issue is legacy systems that are difficult to patch in practice.

reply
I could see some of these corps now being able to issue more patches for old versions of software if they don't have to redirect their key devs onto prior code (which devs hate). As you say though, in practice it is hard to get those patches onto older devices.

I'm looking at you, Android phone makers with 18 months of updates.

reply
Yeah but who pays the enormous cost?
reply
I imagine patching itself will improve as well, even as a separate endeavor. That's not to say legacy systems could be completely rewritten.
reply
Wait. Wasn't AI supposed to alleviate the burden of legacy code?!
reply
If we have the source and it's easy to test, validate, and deploy an update - AI should make those easier to update.

I am thinking of situations where one of those isn't true - where testing a proposed update is expensive or complicated, or systems that are hard to physically push updates to (think embedded systems), etc.

reply
Legacy code, not the running systems powered by legacy code
reply
If you’re still an AI skeptic at this point, I don’t know what sort of advancement could convince you that this is happening.
reply
Most vulnerabilities seem to be in C/C++ code, or web things like XSS, unsanitized input, leaky APIs, etc.

Perhaps a chunk of that token spend will be porting legacy codebases to memory safe languages. And fewer tokens will be required to maintain the improved security.

reply
I think most vulnerabilities are in crappy enterprise software. TOCTOU stuff in the crappy microservice cloud app handling patient records at your hospital, shitty auth at a webshop, that sort of stuff.
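For anyone unfamiliar with the acronym, TOCTOU is "time of check to time of use": the check and the use are two separate steps, and an attacker can change the world in between. A minimal sketch in Python (function names are illustrative, not from any real codebase):

```python
import os
import stat

def read_if_allowed_racy(path):
    # Time of check: is the caller allowed to read this path?
    if not os.access(path, os.R_OK):
        raise PermissionError(path)
    # <-- race window: an attacker can swap `path` for a symlink here
    # Time of use: open and read -- possibly a different file by now.
    with open(path) as f:
        return f.read()

def read_if_allowed_safer(path):
    # Safer shape: open first, then inspect the descriptor actually
    # opened, so the check and the use refer to the same file.
    fd = os.open(path, os.O_RDONLY | getattr(os, "O_NOFOLLOW", 0))
    try:
        st = os.fstat(fd)
        if not stat.S_ISREG(st.st_mode):
            raise PermissionError("not a regular file")
        return os.read(fd, st.st_size).decode()
    finally:
        os.close(fd)
```

The fix isn't "check harder", it's restructuring so there is no gap between check and use - which is exactly the kind of change that's painful to retrofit into a crappy microservice architecture.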

A lot of this stuff is vulnerable by design - the customer wanted a feature, but engineering couldn't make it work securely with the current architecture - so they opened a tiny hole here and there, hoped nobody would notice, and everyone went home when the clock struck 5.

I'm sure most of us know about these kinds of vulnerabilities (and the culture that produces them).

Before LLMs, people needed to invest time and effort into hacking these. But now, you can just build an automated vuln scanner and scan half the internet provided you have enough compute.

I think there will be major SHTF situations coming from this.

reply
Yeah. Crufty, cobbled-together enterprise stuff will suffer some of the worst of it. But this will be a great opportunity for the enterprise software services economy! lol.

I honestly see some sort of automated whole codebase auditing and refactoring being the next big milestone along the chatbot -> claude code / codex / aider -> multi-agent frameworks line of development. If one of the big AI corps cracks that problem then all this goes away with the click of a button and exchange of some silver.

reply
I think we’re starting to glimpse the world in which those individuals or organizations who pigheadedly want to avoid using AI at all costs will see their vulnerabilities brutally exploited.
reply
Yep, it's this. The laggards are going to get brutally eviscerated. Any system connected to the internet is going to be exploited over the next year unless security is taken very seriously.
reply
lol and what about the vibe coders?

You people are comical. Why do you feel the need to create so much hype around what you say? Did you not get enough attention as a kid?

reply
Depends - do you think people are good at keeping their fridge firmware up-to-date?
reply
I’m good at keeping my fridge off the internet.
reply
You are the exception.
reply
Software security heavily favors the defenders (ex. it's much easier to encrypt a file than break the encryption). Thus with better tools and ample time to reach steady-state, we would expect software to become more secure.
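The encryption example is easy to make concrete. This toy XOR-of-SHA-256 keystream is a sketch to illustrate the asymmetry, not a recommended cipher: encrypting is one cheap pass over the data, while an attacker without the key faces a search over the whole keyspace.

```python
import hashlib
import secrets

def keystream(key: bytes, n: int) -> bytes:
    # Derive n pseudorandom bytes from the key via hashed counters.
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def xor_encrypt(key: bytes, data: bytes) -> bytes:
    # XOR stream cipher: the same function encrypts and decrypts.
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

key = secrets.token_bytes(16)  # 128-bit key: trivial for the defender
ciphertext = xor_encrypt(key, b"attack at dawn")
assert xor_encrypt(key, ciphertext) == b"attack at dawn"
# Without `key`, brute force means ~2**128 candidate keys.
```

That said, this asymmetry is a property of cryptographic primitives, not of software systems as a whole - which is where the sibling comment's objection comes in.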
reply
Software security heavily favours the attacker (ex. it's much easier to find a single vulnerability than to patch every vulnerability). Thus with better tools and ample time to reach steady-state, we would expect software to remain insecure.
reply
If we think in the context of LLMs, why is it easier to find a single vulnerability than to patch every vulnerability? If the defender and the attacker are using the same LLM, the defender will run "find a critical vulnerability in my software" until it comes up empty and then the attacker will find nothing.

Defenders are favored here too, especially for closed-source applications where the defender's LLM has access to all the source code while the attacker's LLM doesn't.

reply
You also need to deploy the patch. And a lot of software doesn't have easy update mechanisms.

A fix in the latest Linux kernel is meaningless if you are still running Ubuntu 20.

reply
That generally makes sense to me, but I wonder if it's different when the attacker and defender are using the same tool (Mythos in this case)

Maybe you just spend more on tokens by some factor than the attackers do combined, and end up mostly okay. Put another way, if there's 20 vulnerabilities that Mythos is capable of finding, maybe it's reasonable to find all of them?

reply
From the red team post https://red.anthropic.com/2026/mythos-preview/

"Most security tooling has historically benefitted defenders more than attackers. When the first software fuzzers were deployed at large scale, there were concerns they might enable attackers to identify vulnerabilities at an increased rate. And they did. But modern fuzzers like AFL are now a critical component of the security ecosystem: projects like OSS-Fuzz dedicate significant resources to help secure key open source software.

We believe the same will hold true here too—eventually. Once the security landscape has reached a new equilibrium, we believe that powerful language models will benefit defenders more than attackers, increasing the overall security of the software ecosystem. The advantage will belong to the side that can get the most out of these tools. In the short term, this could be attackers, if frontier labs aren’t careful about how they release these models. In the long term, we expect it will be defenders who will more efficiently direct resources and use these models to fix bugs before new code ever ships."

reply
This is only true if your approach is security through correctness. This never works in practice. Try security through compartmentalization. Qubes OS provides it reasonably well.
reply
I don't think this is broadly true and to the extent it's true for cryptographic software, it's only relatively recently become true; in the 2000s and 2010s, if I was tasked with assessing software that "encrypted a file" (or more likely some kind of "message"), my bet would be on finding a game-over flaw in that.
reply
This came across as so confident that I had a moment of doubt.

It is most definitely an attacker's world: most of us are safe, not because of the strength of our defenses but because of the disinterest of our attackers.

reply
There are plenty of interested attackers who would love to control every device. One is in the white house, for example.
reply
You'd think they would have used this model to clean up Claude's own outage issues and security issues. Doesn't give me a lot of faith.
reply
I'm more curious as to just how fancy we can make our honeypots. These bots aren't really subtle about it; they're used as a kludge to do anything the user wants. They make tons of mistakes on their way to their goals, so this is definitely not any kind of stealthy thing.

I think this entire post is just an advertisement to goad CISOs to buy $package$ to try out.

reply