Use of the "h" register slices (bits 8..15) by compilers is thankfully pretty rare -- otherwise this would have been noticed much sooner!

Agner Fog's optimization guide says "Any use of the high 8-bit registers AH, BH, CH, DH should be avoided because it can cause false dependences and less efficient code."

reply
> Use of the "h" register slices (bits 8..15) by compilers is thankfully pretty rare -- otherwise this would have been noticed much sooner!

It's actually pretty easy to get compilers to use those; you mainly need a bunch of narrow accesses to neighboring memory. The Oodle post contains a Godbolt link to pretty ordinary C code triggering this.

I'd guess that you also need some other conditions (multiple in flight stores, high boost speeds) to trigger this.
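
To make "narrow accesses to neighboring memory" concrete, here is a minimal sketch. It is a hypothetical example, not the code from the Oodle post, and the names `Pair` and `split16` are made up. On x86-64 a compiler may keep the 16-bit value in one register and store its two halves through the low and high byte slices, but whether it actually picks an "h" register depends on the compiler, version and flags.

```c
#include <stdint.h>

/* Hypothetical example, not taken from the Oodle post. */
typedef struct {
    uint8_t lo;  /* byte at offset 0 */
    uint8_t hi;  /* byte at offset 1, right next to lo */
} Pair;

/* Splits one 16-bit value into two byte stores to adjacent addresses.
   A compiler targeting x86-64 may (not must) implement this with the
   low and high 8-bit slices of a single register, e.g. AL and AH. */
static void split16(uint16_t v, Pair *out) {
    out->lo = (uint8_t)(v & 0xFF);
    out->hi = (uint8_t)(v >> 8);
}
```

Calling `split16(0xABCD, &p)` leaves `p.hi == 0xAB` and `p.lo == 0xCD`; looking at the generated assembly is the only way to tell which registers a given compiler chose.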

reply
> Use of the "h" register slices (bits 8..15) by compilers is thankfully pretty rare

That's unfortunate, because it's precisely why things like this will keep happening.

> Agner Fog's optimization guide says "Any use of the high 8-bit registers AH, BH, CH, DH should be avoided because it can cause false dependences and less efficient code."

The sad vicious cycle of compilers not exercising the hardware, and then the hardware designers not paying attention. Using the high 8-bit registers and "implicitly merging" them is one of the ways to reduce the number of instructions and thus improve size optimisation.
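
As a rough software model of what "implicitly merging" means (an illustration only; the function names are invented for this sketch and it says nothing about how any particular CPU implements it): a write to AH or AL replaces 8 bits and keeps the rest of RAX, so the new value depends on the old one, while a write to EAX zero-extends and does not.

```c
#include <stdint.h>

/* Software model of x86-64 partial-register writes (illustration only). */

/* Writing AH replaces bits 8..15 and keeps every other bit of RAX,
   so the result depends on the previous register contents. */
static uint64_t write_ah(uint64_t rax, uint8_t v) {
    return (rax & ~(uint64_t)0xFF00) | ((uint64_t)v << 8);
}

/* Writing AL replaces bits 0..7 and likewise keeps the rest. */
static uint64_t write_al(uint64_t rax, uint8_t v) {
    return (rax & ~(uint64_t)0xFF) | (uint64_t)v;
}

/* Writing EAX zero-extends into RAX: the old value is discarded. */
static uint64_t write_eax(uint64_t rax, uint32_t v) {
    (void)rax;  /* unused on purpose: no dependency on the old value */
    return (uint64_t)v;
}

/* Reading AX sees AL and AH merged into one 16-bit value. */
static uint16_t read_ax(uint64_t rax) {
    return (uint16_t)(rax & 0xFFFF);
}
```

In this model, `read_ax(write_ah(write_al(0, 0xCD), 0xAB))` is `0xABCD`: two byte writes build a 16-bit value with no shift or OR instruction, which is the size saving. The dependence on the old `rax` in `write_ah` and `write_al` is the source of the "false dependences" in the quote above.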

reply
> That's unfortunate, because it's precisely why things like this will keep happening.

I have the opposite opinion. Its use being rare means CPU designers have less need to optimize for that rare case, and hardware optimizations are precisely where these kinds of issues tend to pop up.

And high 8-bit registers are an x86-specific feature; other CPU families don't have them. So that special case being less optimized (or even pessimized) is not much of a loss.

reply
Yep. The "high" registers as an alias for bits 8-15 of certain registers are one of many warts in the architecture; they should have been purged from 32-bit and 64-bit code, and left to rot in 16-bit mode only.

Intel blew it when they let them continue to work in 32-bit code on the 386, and then AMD blew it by repeating the mistake when defining the 64-bit ISA.

reply
> The sad vicious cycle of compilers not exercising the hardware

There could theoretically be instruction selection passes that are biased toward rare instructions, specialized for fuzzing hardware. I'm surprised Intel doesn't already do that.

reply