upvote
How is that a direct comparison? The link you gave has a quote that says it’s not:

> Scoped context: Our tests gave models the vulnerable function directly, often with contextual hints (e.g., "consider wraparound behavior"). A real autonomous discovery pipeline starts from a full codebase with no hints

They pointed the models at the known vulnerable functions and gave them a hint. The hint part is what really breaks this comparison because they were basically giving the model the answer.

reply
Does no one defending mythos understand how nested foreloops work?

loop through each repo: loop through each file: opencode command /find_wraparoundvulnerability next file next repo

I can run this on my local LLM and sure, I gotta wait some time for it to complete, but I see zero distinguishing facts here.

reply
The question is how customized those hints were. That changes whether looping over an entire code base is possible or not.
reply
Please do so, looking forward to your write up
reply