upvote
I worked a bit on the extraction process so I can chime in here a bit. The first part is to just mark the x,y locations of where all the bits are, generally by the intersection of the rows and columns of the microcode array.

Then you have to classify them as 0's or 1's. Each is visually distinct, a 1 being encoded by the presence of a transistor and a gap in the polysilicon. We didn't have to guess which is which is by the nature of Intel microcode we could assume 0's were much more frequent, so a transistor meant a 1.

There are some automatic tools designed to perform this work via color thresholding, but they didn't work very well here because some of the mosaic was blurry, and a lot of dust had crept in which created false 1 bits.

Instead, we trained a convolutional neural network to classify the extracted bit regions into 0's and 1's. This was overlaid back onto the original mosaic as white or black squares at 50% opacity.

Then we spent several long, tedious days just checking the results for errors. Finally we had the raw 2d array of bits - the next step is to extract the microcode words from the bit array.

reply
Here's a video of some guys de layering the chips for the Nintendo 64 lockout mechanism. It's pretty in-depth and it goes over a lot of different ways they do this.

https://youtu.be/HwEdqAb2l50?si=VFLed64PZvpCHfy1

reply
Thanks for sharing, will definitely watch it!
reply
Just look at the images[1].

> The photo above shows part of the microcode ROM. Under a microscope, the contents of the microcode ROM are visible, and the bits can be read out, based on the presence or absence of transistors in each position.

[1]: https://www.righto.com/2020/06/a-look-at-die-of-8086-process...

reply
The microcode is in a ROM. It's a regular structure where a 1 looks different to a 0.
reply
Yes, literally this. No verilog decode, just looking for signals in the image of a 1 vs. a 0. For example, a 1 may be the existence of a transistor at a particular intersection of wiring.
reply
Right. And the best way to think about microcode is as code for a wacky, custom VLIW processor that implements the programmer-level x86 (in this case) instruction set. Various fields in the microcode send signals to different parts of the processor to activate them, routing values along internal busses and between registers, functional units and memory to cause the processor to execute the x86 instructions.
reply
So what you actually need is a program that navigates through the huge image of the die and detects if the structure that is looking at is a 1 or a 0? This at the fundamental level is a cross between machine learning and image processing?
reply
I helped out on this image-to-bits transcription, doing manual verification of the automated work. I did the whole thing by hand: I sliced the ROM images into strips that excluded parts of the image that don't encode bits, used my tablet and stylus to manually place a black dot on every 1 bit, then wrote a trivial program that detected the presence or absence of the black dot in each cell. From my perspective, the ROM is organized like a series of "ladders" where the 1 bits are missing legs of the ladder, and I was placing dots on the missing legs. I compared my results with the ML output and manually re-checked each bit where we disagreed.

http://brianluft.com/images/2026/05/386_microcode_bits.jpg -- my fully annotated result. I was working from a higher-quality PNG; this is highly compressed because it's a big image.

reply
Yes, exactly. Historically you would make some simple image processing software that will align the grid and then look for properties at each specific bit position. Usually die shots are highly imperfect (the delayering usually leaves some artifacts or damage) so frequently merging multiple scans is important as well. Travis Goodspeed has a neat tool for this workflow at https://github.com/travisgoodspeed/maskromtool and the blog mentions John McMaster’s bitract: https://github.com/SiliconAnalysis/bitract although I think most people working on these projects usually just one-off it as the mentioned Discord users in the blog post eventually did.

More modern devices are of course more difficult due to layers, feature size, and less visually obvious ROM bit designs.

Anyway, the impressive part of this project was really understanding the undocumented microcode assembly language through inference and trace following; the 1s and 0s look like they were the easy part!

reply
The full workflow seems to look something like this, with the added complications relative to the 8086 microcode being that the 80386 microcode acts as an orchestration layer on top of hardwired engines, programmable logic arrays, and fault/protection redirection. The 8086 microcode does all that algorithmically, reusing the same hardware instead of having dedicated transistors.

1. Extract the ROM bits. 2. Determine physical-to-logical bit ordering. 3. Identify microinstruction boundaries. 4. Infer field boundaries. 5. Associate fields with hardware destinations (check with die tracing). 6. Decode instruction-dispatch programmable logic arrays. 7. Associate x86 instructions with microcode entry points. 8. Infer repeated idioms: moves, ALU ops, termination, calls, tests, redirects. 9. Decode accelerator protocols. 10. Validate against known architectural behavior.

reply