> Self Modifying and JIT-Compiled Code. Elevator, like all fully static binary rewriters, does not support self modifying or just-in-time-compiled code.
In x86 land, it's hard to find the instruction boundaries statically, because, for historical reasons going back to the 8-bit era, x86 instructions don't have alignment restrictions. The same bytes can decode as different instructions depending on where you start, and that is what makes translation ambiguous.
If you start at the program entry point and follow the reachable instructions, you can find the instruction boundaries. Debuggers and disassemblers do this. Most of the time it works, but you may have to recognize things such as C++ vtables, since indirect calls through them can't be followed by simple tracing. Debug info helps there. There may still be ambiguity. This seems to be about generating all the possible code options to resolve that ambiguity by brute-force case analysis.
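To make the "start at the entry point and follow what's reachable" idea concrete, here is a rough Python sketch of recursive-descent disassembly. It is not how Elevator or any real disassembler is implemented: the `LENGTHS` table, `decode`, and `recursive_descent` are made-up names, and the table only knows six real one-byte opcodes (32-bit mode). It is just enough to show the same bytes yielding different instruction boundaries from different entry points.

```python
# Toy table: a few real one-byte x86 opcodes (32-bit mode) and their lengths.
# nop, ret, mov eax imm32, jmp rel8, jmp rel32, call rel32.
# Everything else is treated as undecodable; a real decoder is far larger.
LENGTHS = {0x90: 1, 0xC3: 1, 0xB8: 5, 0xEB: 2, 0xE9: 5, 0xE8: 5}

def decode(code, off):
    """Return (length, statically known successor offsets), or None."""
    if not 0 <= off < len(code):
        return None
    op = code[off]
    n = LENGTHS.get(op)
    if n is None or off + n > len(code):
        return None
    nxt = off + n
    if op == 0xC3:                      # ret: successor unknown statically
        return n, []
    if op in (0xEB, 0xE9, 0xE8):        # relative jmp/call
        rel = int.from_bytes(code[off + 1:nxt], "little", signed=True)
        target = nxt + rel
        # a call continues at the return address too; a jmp does not
        return n, ([target, nxt] if op == 0xE8 else [target])
    return n, [nxt]                     # plain fall-through

def recursive_descent(code, entry):
    """Collect instruction start offsets reachable from `entry`."""
    seen, work = set(), [entry]
    while work:
        off = work.pop()
        if off in seen:
            continue
        decoded = decode(code, off)
        if decoded is None:             # ran into bytes we can't decode
            continue
        seen.add(off)
        work.extend(decoded[1])
    return sorted(seen)

code = bytes.fromhex("b8eb01c390c3")
print(recursive_descent(code, 0))  # [0, 5]    -> mov eax, imm32 ; ret
print(recursive_descent(code, 1))  # [1, 4, 5] -> jmp +1 ; nop ; ret
```

Entering one byte later lands inside the `mov` immediate and finds a perfectly valid, completely different instruction stream. Note that `ret` and anything indirect give the traversal no successors, which is exactly where vtables and function pointers cause trouble.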
x86 doesn't have explicit code/data separation, which some architectures do. So they have to try instruction decoding on all data built into the executable. They cull obvious mistranslations. Even so, someone mentioned they still end up with a 50x space expansion. Most of that will be unreachable mistranslated code.
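A minimal sketch of the "decode at every offset, then cull the obvious mistakes" approach, again in Python with a toy opcode table. This is my guess at the general shape, not Elevator's actual algorithm: `OPCODES` and `superset_decode` are hypothetical names, and the only culling rule here is "an instruction that falls through into something undecodable can't be real code".

```python
# Toy table: opcode -> (length, falls_through_to_next_instruction).
# nop, ret, mov eax imm32, jmp rel8, jmp rel32, call rel32 (32-bit mode).
OPCODES = {
    0x90: (1, True), 0xC3: (1, False), 0xB8: (5, True),
    0xEB: (2, False), 0xE9: (5, False), 0xE8: (5, True),
}

def superset_decode(code):
    """Try every byte offset as an instruction start, then cull.

    Returns {offset: length} for the candidates that survive.
    """
    cands = {}
    for off in range(len(code)):
        entry = OPCODES.get(code[off])
        if entry is not None and off + entry[0] <= len(code):
            cands[off] = entry
    # Cull to a fixpoint: a candidate that falls through into an
    # undecodable offset (or off the end of the buffer) is dropped,
    # which may in turn invalidate the candidate before it.
    changed = True
    while changed:
        changed = False
        for off, (length, falls) in list(cands.items()):
            if falls and off + length not in cands:
                del cands[off]
                changed = True
    return {off: length for off, (length, _) in cands.items()}

print(sorted(superset_decode(bytes.fromhex("b8eb01c390c3"))))  # [0, 1, 3, 4, 5]
print(sorted(superset_decode(bytes.fromhex("909001c3"))))      # [3]
```

In the first example five of the six byte offsets survive as candidate instruction starts, even though any single execution path uses at most three of them. That overlap is where the space blowup comes from: every surviving candidate has to be translated, because nothing static proves it is unreachable.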
You can't look at a static executable which uses pointers to functions and say "that data cannot possibly be code", without constraining what those pointers point to. That involves predicting run-time behavior, which may not be possible.
If it did, it wouldn't be "fully static" anymore. It's fundamentally contradictory.
There are a lot of crufty x86 edge cases to handle to achieve perfect(ish) emulation or translation.
After those machines, at the Pentium Pro, with look-ahead instruction decoding, it became a major lose to store into code. Superscalar x86 CPUs have the hardware to detect and handle stores into code, but it requires bringing the CPU to a clean halt, almost like an exception or interrupt, discarding pipelined work that's already been done, and then restarting the pipeline and reloading the instructions ahead. All the performance gains of superscalar hardware are lost for a while.
There are RISC architectures where self-modifying code isn't supported, and code pages must be read-only. Then the CPU doesn't need the machinery for detecting a store into code and aborting look-ahead. MacOS has enforced that rule since the PowerPC era.