With more bits, SUB becomes more and more expensive to fit in the same number of clocks as XOR. So with an 8-bit CPU like the Z80, it probably makes design sense to have XOR and SUB both take one cycle. But if a CPU uses, for instance, 128-bit registers, the carry-propagation logic for ADD/SUB might take so much longer than XOR that the designers might not try to fit ADD/SUB into the same single clock cycle, and might instead do multi-cycle pipelined ADD/SUB.
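To make that concrete, here's a toy C model (not how real ALUs are built, and the function names are mine): each XOR result bit depends only on the same bit of the inputs, while subtraction's borrow out of bit i feeds bit i+1, so the worst-case logic depth grows with register width. Real ALUs shorten the chain with carry-lookahead and similar tricks, but can't eliminate it.

    #include <stdint.h>
    #include <stdio.h>

    static uint8_t xor8(uint8_t a, uint8_t b) {
        uint8_t r = 0;
        for (int i = 0; i < 8; i++)        /* each bit independent: depth O(1) */
            r |= (((a >> i) ^ (b >> i)) & 1u) << i;
        return r;
    }

    static uint8_t sub8(uint8_t a, uint8_t b) {
        uint8_t r = 0, borrow = 0;
        for (int i = 0; i < 8; i++) {
            uint8_t ai = (a >> i) & 1u, bi = (b >> i) & 1u;
            r |= (ai ^ bi ^ borrow) << i;  /* difference bit */
            /* borrow ripples into bit i+1: chain depth O(bits) */
            borrow = ((ai ^ 1u) & (bi | borrow)) | (bi & borrow);
        }
        return r;
    }

    int main(void) {
        printf("%02X %02X\n", xor8(0xA5, 0x3C), sub8(0xA5, 0x3C)); /* 99 69 */
        return 0;
    }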

A real-world CPU example is the Cray-1, where S-Register Scalar Operations (64-bit) take 3 cycles for ADD/SUB but still only 1 cycle for XOR. [1]

[1] https://ed-thelen.org/comp-hist/CRAY-1-HardRefMan/CRAY-1-HRM...

reply
The 6502 doesn't support XOR A or SUB A, and in fact doesn't have a SUB opcode at all, only SBC (subtract with carry, requiring an extra opcode to set the carry flag beforehand).
reply
I was handwaving over the details; SBC is identical to SUB when the carry flag is set, so it's understandable why the 6502 designers didn't waste an instruction slot.

EOR and SBC still have the same cycle counts though.
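
For the record, binary-mode SBC on the 6502 computes A = A - M - (1 - C), with carry meaning "no borrow". A minimal C sketch of those semantics (the struct and names are just for illustration), showing why SEC + SBC is the idiomatic SUB:

    #include <stdint.h>
    #include <stdio.h>

    typedef struct { uint8_t a, c; } cpu;

    /* binary-mode 6502 SBC: A = A - M - (1 - C) */
    static void sbc(cpu *p, uint8_t m) {
        unsigned r = (unsigned)p->a - m - (1u - p->c);
        p->c = (r <= 0xFFu);     /* carry set = no borrow occurred */
        p->a = (uint8_t)r;
    }

    int main(void) {
        cpu p = { 0x50, 1 };     /* carry set beforehand, as SEC would */
        sbc(&p, 0x10);
        printf("A=%02X C=%u\n", p.a, p.c);  /* A=40 C=1: a plain SUB */
        return 0;
    }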

reply
On the Cortex-A8, vsub reads the second source register a cycle earlier than veor, so that can add one cycle of latency.

Not scalar, but still sub vs xor. Though you’d use vmov immediate for zeroing anyway.
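
For the curious, the three zeroing idioms as NEON intrinsics (compiles for an ARM target with arm_neon.h; the function names are mine). vdupq_n_u32(0) typically assembles to a vmov immediate, which has no source operands at all, whereas the veor/vsub idioms can't issue until x is ready:

    #include <arm_neon.h>
    #include <stdio.h>

    static uint32x4_t zero_mov(void)         { return vdupq_n_u32(0); }  /* no inputs */
    static uint32x4_t zero_xor(uint32x4_t x) { return veorq_u32(x, x); } /* waits on x */
    static uint32x4_t zero_sub(uint32x4_t x) { return vsubq_u32(x, x); } /* waits on x */

    int main(void) {
        uint32x4_t x = vdupq_n_u32(0xDEADBEEFu);
        printf("%u %u %u\n",
               vgetq_lane_u32(zero_mov(), 0),
               vgetq_lane_u32(zero_xor(x), 0),
               vgetq_lane_u32(zero_sub(x), 0));  /* 0 0 0 */
        return 0;
    }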

reply
Harvard Mark I? Not sure why people think programming started with the Z80.
reply
The article is about x86, and x86 assembly is mostly a superset of the 8080's (which is why the machine language numbers the registers AX/CX/DX/BX, roughly matching the functions of A/BC/DE/HL on the 8080; in particular, BX and HL both come last).
reply
My WW2-era assembly is a bit rusty, but I don't think the Harvard Mark I had bitwise logical operations?
reply