undefined

points

[-]

Intel had given us some clues - they had written somewhere that the 386 had 2560 microcode words. The microcode array has 37 banks - each bank resolves one bit from the 37 bits that comprise a microcode word. But which way to decode them? From top down? Bottom up? Were they interleaved in weird ways?

Documentation from the NEC vs Intel lawsuit ended up documenting the microcode word format for both the 8088 and NEC V20 CPUs, but unfortunately, we were on our own for the 386. But we could take educated guesses - working off the 8088 field format, what additional microcode fields would a 386 add? What fields would expand and how many bits would they need?

We used a lot of python scripts to decode the microcode array into 37-pixel wide, very long bitmaps, in different permutations, to see if any vertical patterns emerged that would hint to us the boundaries of microcode word fields. And some did emerge!

by GloriousCow41 minutes ago|

parent|

[-]

We also had decoded the 386's match-decoder PLA, so we knew roughly the locations of different opcodes were in the microcode itself, which was very helpful. Some opcodes have very specific operands, so would have unique field references. Some forms only operate on EAX/AX, for example, so if you find those instructions you have a hint of how the AX register is encoded as an operand.

Other instructions like PUSHA and POPA are implemented as loops that iterate by incrementing the fields corresponding to registers - and we know in what order they operate.

Bit by bit, relation by relation, you can puzzle out the format of the microcode. Of course, this is glossing over the enormous added complexity of protected-mode operations. This was a herculean effort by reenigne, and I don't think it is hyperbole to call it one of the more impressive human achievements I have witnessed in my lifetime.

by GloriousCow18 minutes ago|

parent|

[-]

The actual output of microcode disassembly is just a text file - a line of code for each microcode word, in essentially a a new dialect - a type of static assembly language. reenigne had to invent names for a lot of things, that will now become the official names of these things, unless Intel ever decides to speak up and make corrections.

That language can then be translated into Verilog, and has been.