models are getting weirdly good at hacking while still sort of sucking at a bunch of economically valuable tasks

like most human hackers

They are training them on decompilation, reverse engineering, black-box reimplementation, and pentesting because these are among the best ways to generate interesting and rare RL traces for agentic coding AND to teach them how lots of things work under the hood.

Just throw Claude at millions of binaries and you can get amazing training data. Oh wait, 4.7 gives you refusals for that now.

Honestly, I sometimes feel like hacking is about the only thing they do successfully. Not just in the sense of breaking into systems that are assumed to be secure, although also in that sense. They're just highly effective at fumbling around with a hatchet until something works. We happen to have version control and automated testing, which generally make that approach somewhat viable for the task of programming. But while I've been genuinely impressed at how far they can get a feature into a workable state, I've never been confident, looking at the output, that it's going to be more than POC quality at the current state of things. Still, they're pretty dang effective at that, given enough time and a safe space to hack away and reset until the product looks close enough.