That said, given how things are I wouldn't be surprised if you could let Claude or similar have a go at the source code of the kernel or core services, armed with some VMs for the try-fail iteration, and get it pumping out CVEs.
If not now, then surely in the not-too-distant future.
[1]: https://www.freebsd.org/security/advisories/FreeBSD-SA-26:08...
I recently directed ChatGPT, through the web interface, to create a Firefox extension to obfuscate certain HTTP queries and was denied/rebuffed because:
"... (the) system is designed to draw a line between privacy protection and active evasion of safeguards."
Why would this same system empower fuzzing of a binary (or other resource), and why would it allow me to work toward generating an exploit?
Do the users just keep rephrasing the directive until the model acquiesces? Or does the API not have the same training wheels as the web interface?
The answer is complex; the video is worth watching. But mainly, they don't know where to draw the line. Defenders need tools as good as attackers'. Attackers will jailbreak models while defenders might not, so are the safeguards a net positive in that case? Carlini actively asks the audience and the community for "help" in determining how to proceed, basically.
1. Unit testing is (somewhat) dead; long live simulation. Testing the parts only gets you so far. Simulation tests are far more durable, independent artifacts (consider: if you went from JS to Rust, how much of your testing would carry over?).
2. Testing has to be "stand alone". I want to be able to run it from the command line, and I want the output to be wrappable so I can shove it onto a web page or dump it into an API (for AI).
3. Messages (for failures) matter. They are not just a simple "what's broken"; they must contain enough information for context.
4. Your "failed" tests should include logs. Do you have enough breadcrumbs for production? If not, this is a problem that will bite you later.
5. Any case should be an accumulation of state and behavior; this really matters in simulation.
If you have done all the above right and your tool can return all the data, dumping the output into the cheapest model you can find and having it "write a prompt with recommendations on a fix (not actual code, just what should be done beyond 'fix this')" has been illuminating.
Ultimately I realized that how I thought about testing was wrong. Its output should either be dead simple, or contain enough information that someone with zero knowledge could ramp up into a fix on their first day in the codebase. My testing was never this good because the "cost of doing it that way" was always too high... this is no longer the case.
Is that a good thing or bad?
I see that as a very good thing. Because you can now inexpensively find those CVEs and fix them.
Previously, finding CVEs was very expensive. That meant only bad actors had the incentive to look for them, since they were the ones who could profit from the effort. Now that CVEs can be found much more cheaply, people without a profit motive can discover them as well--allowing vulnerabilities to be fixed before bad actors find them.
Not all CVEs are the same; some aren't important. So it really depends on what gets found as a CVE. The bad part is you risk a flood of CVEs that don't matter (or have already been reported).
> That meant only bad actors had the incentive to look for them
Nah. Lots of people look for CVEs. It's good resume fodder. In fact, it's already somewhat of a problem that people will look for and report CVEs on things that don't matter just so they can put "I found and reported CVE xyz" on their resume.
What this will do is expose some already present flaws in the CVE scoring system. Not all "9"s are created equal. Hopefully that leads to something better and not towards apathy.
There are some extreme cases that might require extensive code changes, and those would benefit from LLMs. But a lot of the issues are things like off-by-one errors with pointers.
Claude was used to find the bug in the first place though. That CVE write-up happened because of Claude, so while there are some very talented humans in the loop, Claude is quite involved with the whole process.
Do you have a link to that? A rather important piece of context.
Wasn't trying to downplay this submission in any way; the main point still stands:
But finding a bug and exploiting it are very different things. Exploit development requires understanding OS internals, crafting ROP chains, managing memory layouts, debugging crashes, and adapting when things go wrong. This has long been considered the frontier that only humans can cross.
Each new AI capability is usually met with “AI can do Y, but only humans can do X.” Well, for X = exploit development, that line just moved.
It was a quote from your own link from the initial post?
https://www.freebsd.org/security/advisories/FreeBSD-SA-26:08...
> Credits: Nicholas Carlini using Claude, Anthropic
A write-up of that would have been interesting, to see just what Claude was used for.
https://www.youtube.com/watch?v=1sd26pWhfmg
https://securitycryptographywhatever.com/2026/03/25/ai-bug-f...
It pretty much is just "Claude find me an exploitable 0-day" in a loop.
https://www.youtube.com/watch?v=1sd26pWhfmg
Claude is already able to find CVEs at an expert level.
The chance this is completely fabricated, though, is very low, and it's a highly interesting signal to many others.
There was also a really good AI CTF talk at the 39c3 hacker conference just four months ago.
Does it fix them as fast as it finds them? Bonus if it adds snarky code comments
For this kind of fuzzing, LLMs are not bad.
I have gotten absolutely incredible results out of it. I have had a few hallucinations/incorrect analyses (hard to tell which it was), but in general the results have been fantastic.
The closest I've come to security vulnerabilities was a Bluetooth OBD-II reader. I gave Claude the APK and asked it to reverse engineer the BLE protocol so that I could use the device from Python. There was apparently a ton of obfuscation in the APK, with the actual BLE logic buried inside an obfuscated native code library instead of Java code. Claude eventually asked me to install the Android emulator so that it could use https://frida.re to do dynamic instrumentation instead of relying entirely on static analysis. The output was impressive.
If you need to access someone's account or decrypt their hard drive, brute force is an effective way to do it.
https://blog.mozilla.org/en/firefox/hardening-firefox-anthro...
I think the Mozilla example is a good one because it's a large codebase. Lots of people keep asking "how does it do with a large codebase?" Well, there you go.
It found the bug, man. You didn't even read the advisory. It was credited to "Nicholas Carlini using Claude, Anthropic".
I haven't been able to find a write-up on the bug finding, and since the advisory didn't contain any details at all, it's unclear to me if Claude was just used to proofread the submission, actively used to help find the bug, or if it found it in a more autonomous way.
FreeBSD kernel is written in C right?
AI bots will trivially find CVEs.
https://blog.calif.io/p/mad-bugs-claude-wrote-a-full-freebsd
A reminder: this bug was also found by Claude (specifically, by Nicholas Carlini at Anthropic).
https://www.youtube.com/watch?v=1sd26pWhfmg
Looks like LLMs are getting good at finding and exploiting these.
A theoretical random tweet and a clear demonstration are two different things.
What about FreeBSD 15.x then? I didn't see anything in the release notes or the mitigations(7) man page about KASLR. Is it being worked on?
NetBSD apparently has it: https://wiki.netbsd.org/security/kaslr/
[kmiles@peter ~]$ cat /etc/os-release
NAME=FreeBSD
VERSION="13.3-RELEASE-p4"
VERSION_ID="13.3"
ID=freebsd
ANSI_COLOR="0;31"
PRETTY_NAME="FreeBSD 13.3-RELEASE-p4"
CPE_NAME="cpe:/o:freebsd:freebsd:13.3"
HOME_URL="https://FreeBSD.org/"
BUG_REPORT_URL="https://bugs.FreeBSD.org/"
[kmiles@peter ~]$ sysctl kern.elf64.aslr.enable
kern.elf64.aslr.enable: 1
Automatic discovery can be a huge benefit, even if the transition period is scary.
Nevertheless, attacking is a targeted endeavour, unlike defense: the attacker needs one bug, while fixing everything is, in _general_, the harder problem in theory.
* reference to the past Google and FFmpeg incident
I am hoping that quite soon we will have general acceptance of the fact that "Claude can write code" and we will switch focus to how good / not good that code is.
Here's what I'm referring to: https://github.com/califio/publications/blob/7ed77d11b21db80...
"demon strait". Was this speech to text? That might explain the punctuation and grammar.
https://blog.calif.io/p/mad-bugs-claude-wrote-a-full-freebsd
But I found the exploit writeup pretty interesting.
This is what Claude is meant to be able to do.
Preventing it doing so is just security theater.
Is it possible to pwn without SSH listening?
> Attack surface: NFS server with kgssapi.ko loaded (port 2049/TCP)
Not sure who would run an internet-exposed NFS server. Shodan would know.