upvote
What would be the difficulty level for it to just read the machine code; are these models heavily relying on human language for clues?
reply
Reasoning on pure machine code or disassembly is still hit and miss. For better results you can run the binary through a disassembler, then ask an llm to turn that into an equivalent c program, then ask it to work on that. But some of the subtleties might get lost in translation
reply
If you put codex in Xhigh and allow it access to tools, it will take an hour but it will eventually give you back quality recompiled code, with the same issues the original had (here quality means readable)
reply
I had a bit of a pain of a time trying to get Claude to work with ghidra. What you’re describing seems like a better alternative, would you agree?
reply
I've had a lot of luck with pyghidra-mcp -- give it a try :)
reply
Well i have tried and it only works for simple use-case.
reply
You can tweak the current Ghidra MCP to work in headless mode. It makes things much easier.
reply
I have had Claude read usbpcap to reverse engineer an industrial digital camera link. It was like pulling teeth but I got it done (I would not have been able to do it alone)
reply
Paired with Ghidra having a binary, being able to do a memory dump of a live running program, and being able to use wireshark to dump traffic over network/bluetooth/usb is VERY helpful if you don't have the source code.

You use decompilation tools and hope they left debug symbols in and it turns it into somewhat human-readable language which is often enough. Even when you don't binaries use libraries which are known or at some point hit documented interfaces so things can be reasoned about.

reply
I had Claude reverse some firmware. I gave it headless ghidra and it spat out documentation for the internal serial protocol I was interested in. With the right tools, it seems to do pretty well with this kind of task.
reply
It will have to use a disassembler, or write one. I recently casually asked gpt-5.4 to translate the content of a MIDI file to a custom sound programming language. It just wrote a one-shot MIDI parser in Python, grabbed the data, and basically did a perfect translation at first try. Nice.
reply
I've seen Claude do similar things for image files. Don't have PNG parsing utilities installed? No worries, it'll just synthesize a Python script to decode the image directly.
reply
That's a pretty big gimme!
reply
It's not a far step from having the firmware binaries and doing analysis with ghidra, etc.
reply