by userbinator 22 hours ago

by tptacek 14 hours ago

"No one bothered to look" is how most vulnerabilities work. Systems development produces code artifacts with compounding complexity; it is extraordinarily difficult to keep up with it manually, as you know. A solution to that problem is big news.

Static analyzers will find all possible copies of unbounded data into smaller buffers (especially when the size of the target buffer is easily deduced). They will then report them whether or not every path to that code clamps the input. Which is why this approach doesn't work well in the Linux kernel in 2026.

by rubendev 13 hours ago

With a capable static analyzer that is not true. In many common cases such analyzers can deduce the possible ranges of values from the branching checks along the data-flow path, and if that range falls within the buffer they do not report the copy.
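A minimal C sketch of the kind of case under discussion, with invented names (`struct record`, `set_name`, `NAME_MAX_LEN` are all hypothetical). A purely pattern-based check might flag the `memcpy` because `len` comes from the caller; an analysis that tracks value ranges through the branch could in principle see that `len` is at most `NAME_MAX_LEN - 1` by the time the copy runs. Whether any particular tool actually proves this is not something the sketch demonstrates.

```c
#include <stddef.h>
#include <string.h>

#define NAME_MAX_LEN 32

/* Hypothetical record type, invented for this illustration. */
struct record {
    char name[NAME_MAX_LEN];
};

/*
 * Copies len bytes of src into rec->name and NUL-terminates it.
 * Returns 0 on success, -1 if the input does not fit.
 */
int set_name(struct record *rec, const char *src, size_t len)
{
    if (len >= NAME_MAX_LEN)
        return -1;               /* every out-of-range length exits here */

    /* On this path len is in [0, NAME_MAX_LEN - 1], so both writes fit. */
    memcpy(rec->name, src, len);
    rec->name[len] = '\0';
    return 0;
}
```

If the guard lived in a distant caller instead of next to the copy, the analyzer would need interprocedural reasoning to reach the same conclusion, which is where the disagreement in this thread sits.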

by tptacek 12 hours ago

Be specific. Which analyzer are you talking about and which specific targets are you saying they were successful at?

by canucker2016 10 hours ago

Intrinsa's PREfix static source code analyzer would model the execution of the C/C++ code to determine values which would cause a fault.

IIRC they were using a C/C++ compiler front end from EDG to parse C/C++ code to a form they used for the simulation/analysis.

see https://web.eecs.umich.edu/~weimerw/2006-655/reading/bush-pr... for more info.

Microsoft bought Intrinsa several years ago.

by tptacek 8 hours ago

I'm sure this is very interesting work, but can you tell me what targets they've been successful surfacing exploitable vulnerabilities on, and what the experience of generating that success looked like? I'm aware of the large literature on static analysis; I've spent most of my career in vulnerability research.

by canucker2016 6 hours ago

PREfix wasn't designed specifically for finding exploitable bugs - it was aimed somewhere in between Purify (runtime bug detection) and being a better lint.

One of the articles/papers I recall said that the big problem for PREfix when simulating the behaviour of code was the explosion in complexity if a given function had multiple paths through it (e.g. multiple if's/switch statements). PREfix had strategies to reduce the time spent in these highly complex functions.

Here's a 2004 link that discusses the limitations of PREfix's simulated analysis - https://www.microsoft.com/en-us/research/wp-content/uploads/...

The above article also talks about Microsoft's newer (for 2004) static analysis tools.

There's a Netscape engineer endorsement in a CNet article when they first released PREfix. see https://www.cnet.com/tech/tech-industry/component-bugs-stamp...
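To make the path explosion mentioned above concrete, here is a small C sketch. The functions `classify` and `path_count` are invented for illustration and say nothing about how PREfix itself counted or pruned paths; they only show that k independent two-way branches in sequence give 2^k distinct paths for a path-by-path simulation to consider.

```c
#include <stdint.h>

/* Three independent branches: 2 * 2 * 2 = 8 distinct paths through this function. */
int classify(int a, int b, int c)
{
    int score = 0;
    if (a > 0)
        score += 1;
    if (b > 0)
        score += 2;
    if (c > 0)
        score += 4;
    return score;
}

/*
 * Number of paths through k independent two-way branches in sequence.
 * Assumes k < 64 so the shift is well defined.
 */
uint64_t path_count(unsigned k)
{
    return (uint64_t)1 << k;
}
```

Twenty such branches already yield over a million paths, which is why a simulator needs some strategy for bounding the work per function.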

by mrshadowgoose 15 hours ago

> Not "hidden", but probably more like "no one bothered to look".

Well yeah. There weren't enough "someones" available to look. There are a finite number of qualified individuals with time available to look for bugs in OSS, resulting in a finite amount of bug finding capacity available in the world.

Or at least there was. That's what's changing as these models become competent enough to spot and validate bugs. That finite global capacity to find bugs is now increasing, and actual bugs are starting to be dredged up. This year will be very very interesting if models continue to increase in capability.

by literalAardvark 11 hours ago

I was just thinking about this and what it means for closed source code.

Many people with skin in the game will be spending tokens on hardening OSS bits they use, maybe even part of their build pipelines, but if the code is closed you have to pay for that review yourself, making you rather uncompetitive.

You could say there's no change there, but the number of people who can run a Claude review and the number of people who can actually review a complicated codebase are several orders of magnitude apart.

Will some of them produce bad PRs? Probably. The battle will be to figure out how to filter them at scale.

by dolmen 9 hours ago

I have no doubt that LLMs can be as good at analyzing binaries as at analyzing source code.

An avalanche of 0-days in proprietary code is coming.

by NitpickLawyer 22 hours ago

> This is something a lot of static analysers can easily find.

And yet they didn't (either no one ran them, or they didn't find it, or they did find it but it was buried in hundreds of false positives) for 20+ years...

I find it funny that every time someone does something cool with LLMs, there's a bunch of takes like this: it was trivial, it's just not important, my dad could have done that in his sleep.

by userbinator 22 hours ago

Remember Heartbleed in OpenSSL? That long predated LLMs, but same story: some bozo forgot how long something should/could be, and no one else bothered to check either.

by dlopes7 17 hours ago

Hey we are the bozos

by braiamp 17 hours ago

Let's all get together and self-reflect on the bozos' way.

by sam_bristow 9 hours ago

I believe that once the OpenBSD team started cleaning up some of the other gross coding-style stuff as part of their fork into LibreSSL, even fairly simplistic static analysis tools could spot the underlying bugs that caused Heartbleed.

by tptacek 7 hours ago

The bug that caused Heartbleed was extremely obvious: read a u16 out of a packet, copy that many bytes of the source packet into the reply packet. If someone put that code in front of you in isolation you would spot it instantly (if you know C). The problem --- this is hugely the case with most memory safety bugs --- is that it's buried under a mountain of OpenSSL TLS protocol handling details. You have to keep resident in your brain what all the inputs to the function are, and follow them through the code.
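The shape of the bug described above, shown in isolation as a simplified C sketch. This is not OpenSSL's code: the message layout (a 2-byte big-endian length followed by the payload) and the names `read_u16_be`, `echo_unchecked`, and `echo_checked` are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reads a big-endian 16-bit value from the first two bytes at p. */
static uint16_t read_u16_be(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Vulnerable shape: copies as many bytes as the packet claims to carry. */
size_t echo_unchecked(const uint8_t *pkt, size_t pkt_len, uint8_t *reply)
{
    (void)pkt_len;                       /* the real length is never consulted */
    uint16_t claimed = read_u16_be(pkt);
    memcpy(reply, pkt + 2, claimed);     /* may read past the end of pkt */
    return claimed;
}

/*
 * Checked shape: rejects a claimed length larger than what actually arrived
 * or larger than the reply buffer. Returns the number of bytes copied;
 * 0 means the packet was rejected or the payload was empty.
 */
size_t echo_checked(const uint8_t *pkt, size_t pkt_len,
                    uint8_t *reply, size_t reply_cap)
{
    if (pkt_len < 2)
        return 0;
    uint16_t claimed = read_u16_be(pkt);
    if (claimed > pkt_len - 2 || claimed > reply_cap)
        return 0;
    memcpy(reply, pkt + 2, claimed);
    return claimed;
}
```

Side by side, the missing comparison is easy to see. The difficulty the comment points at is that in a real protocol stack `pkt`, `pkt_len`, and the reply buffer are set up far from the copy, so a reader has to track them across many functions.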

by choeger 16 hours ago

It's much, much, easier to run an LLM than to use a static or dynamic analyzer correctly. At the very least, the UI has improved massively with "AI".

by pixl97 9 hours ago

Most people have no idea how hard it is to run static analysis on C/C++ code bases of any size. There are a lot of ways to do it wrong that eat a ton of memory/CPU time or start pruning things that are needed.

If you know what you're doing you can split the code up into smaller chunks where you can look with more depth in a timely fashion.

by mrshadowgoose 15 hours ago

And even if that's true (and it frequently is!), detractors usually miss the underlying and immense impact of "sleeping dad capability" equivalent artificial systems.

Horizontally scaling "sleeping dads" takes decades, but inference capacity for a sleeping dad equivalent model can be scaled instantly, assuming one has the hardware capacity for it. The world isn't really ready for a contraction of skill dissemination going from decades to minutes.

by pjmlp 18 hours ago

Most likely no one ran them, given the developer culture.

by wat10000 9 hours ago

There’s the classic case of the Debian OpenSSL vulnerability, where technically illegal but practically secure code was turned into superficially correct but fundamentally insecure code in an attempt to fix a bug identified by a (dynamic, in this case) analyzer.