"Exploiting undefined behavior" occurs when a simple semantics (however one defines "simple") results in behavior A, but the compiler chooses behavior B instead based on the actual, more complex, language semantics. The code snippet in question passes that test. If I flip the declaration order of i and arr, then I get this [1] at -O0 (the "simple" semantics):

        push    rbp
        mov     rbp, rsp
        mov     dword ptr [rbp - 4], 0
        mov     dword ptr [rbp - 4], 5
        mov     eax, dword ptr [rbp - 4]
        pop     rbp
        ret
Which indeed returns 5. But at -O2 clang optimizes it to this:

        xor     eax, eax
        ret
Which returns 0. So the simple semantics produces one result, and the complex semantics produces another. Hence, it's exploiting undefined behavior.

[1]: https://godbolt.org/z/df4dhzT5a

reply
Maybe this is just arguing semantics, but I don't agree with the definition you've given, and I don't think that your definition is what TFA means. "Exploiting undefined behavior" I think refers narrowly to the implementation assuming that undefined behavior does not occur, and acting on that assumption. Undefined behavior naturally resulting in unpredictable behavior is not exploitation in the same sense. For example,

  printf("A");
  bool x;
  if ( x ) {printf("B");} else {printf("C");}
  printf("\n");
If at -O0 "AB" is printed and at -O2 "AC" is printed (due to the vagaries of whatever was left on the stack), then that would meet your definition, but I would not regard that as "exploiting undefined behavior", merely as the manifestation of the inherent unpredictability of UB. If the compiler didn't print anything (i.e. the whole block was removed due to UB detection) then that _would_ be a case of exploiting undefined behavior.
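To make the narrow sense concrete, here is a minimal sketch of my own (not the code from TFA), using the well-known signed-overflow case. The function names are hypothetical. The naive check is UB exactly when x == INT_MAX, and optimizing builds of GCC and clang will typically fold it to "return false" on the assumption that the overflow never happens. That folding is what I'd call exploiting UB, as opposed to merely reading whatever garbage happens to be on the stack.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical overflow check. Evaluating x + 1 is undefined behavior
   when x == INT_MAX, so an optimizer is allowed to assume that case
   never occurs and may reduce the whole body to "return false". */
bool next_wraps_naive(int x) {
    return x + 1 < x;
}

/* Well-defined alternative: compare against the limit directly, so no
   overflowing arithmetic is ever performed. */
bool next_wraps_checked(int x) {
    return x == INT_MAX;
}
```

For any x below INT_MAX both functions are well defined and agree. Only the INT_MAX input distinguishes them, and that is precisely the input on which the naive version has no defined meaning.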
reply
That example is an instance of unspecified vs. undefined behavior, but the correctness of the pointer provenance-based optimization example I gave doesn't depend on whether writing to an out-of-bounds pointer is unspecified or undefined.
reply
What about this: https://godbolt.org/z/xP9xG3Ee3

Here the compiler "register allocates" i for some reads but not for others.

i gets stack allocated, but some uses of it act as though they were register allocated.

reply
I'm not sure exactly what you're asking for, given the link is for clang trunk and doesn't have the modifications discussed in TFA, and I don't dispute that clang does UB-based reasoning at -O3. But I will argue that the assembly shown can be accomplished without resorting to what I call "reasoning about UB", and within a flat memory model, supporting the claim that these sacrifices are often not necessary.

I'm going to draw a distinction between stack memory being "private", in the sense that only the compiler is allowed to alter it, and "public", where the address can be written to by something else and the compiler needs to handle that. Local variables are at first tracked privately. After the address of a variable is taken with &x, or at the point in time when an array variable is indexed, the associated memory is public. Conceptually, the use of private memory can be indirect; the compiler could encode/decode a stack variable x as (x XOR 0xDEADBEEF) on the stack and it would be fine (but the compiler does the simple thing in practice, naturally). Note that this notion of "private"/"public" stack memory is a property of addresses, not the provenance of the accessing pointers, and so is fully compatible with a flat memory model. The compiler's ability to use private memory isn't a case of "reasoning about UB" in a meaningful sense -- otherwise you could just as well argue that returning from a function call is "reasoning about UB", because the return address can't be clobbered.
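As a toy model of the XOR remark, here is a sketch in plain C that simulates what a compiler would be free to do with a private slot. The private_slot type and the helper names are hypothetical, made up for illustration; no real compiler is claimed to do this.

```c
#include <stdint.h>

#define SLOT_KEY 0xDEADBEEFu

/* Hypothetical model of a compiler-private stack slot: only the two
   helpers below are supposed to touch the raw bits. */
typedef struct {
    uint32_t raw;
} private_slot;

static void slot_store(private_slot *s, uint32_t v) {
    s->raw = v ^ SLOT_KEY;   /* encode on the way in */
}

static uint32_t slot_load(const private_slot *s) {
    return s->raw ^ SLOT_KEY; /* decode on the way out */
}

/* Store then load through the helpers: the encoding is invisible. */
uint32_t slot_roundtrip(uint32_t v) {
    private_slot s;
    slot_store(&s, v);
    return slot_load(&s);
}

/* What a foreign reader of the raw memory would actually see. */
uint32_t slot_raw_bits(uint32_t v) {
    private_slot s;
    slot_store(&s, v);
    return s.raw;
}
```

As long as every access goes through the helpers, the program can't tell the difference. A stray write of a raw 5 into the slot would decode as 5 XOR 0xDEADBEEF, which is the sense in which clobbering private memory breaks the program without the compiler having assumed anything.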

In your provided snippet, the correctness argument for the assembly in a non-UB-reasoning universe goes like this: at first, i is stored privately on the stack with value zero, and so as an optimization we can assume that value is still zero without rereading. Only later, when &i is taken, is that memory made public and the compiler has to worry about something altering it. In actual execution, the problem is that the write function alters compiler-private memory (and note again, that being private is a property of the underlying address, not the fact that it's accessed via an out-of-bounds array indexing), and this is UB and so the program breaks. But, the compiler didn't need to make _assumptions_ around UB.
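Here is a well-defined sketch of the private-then-public sequence, written by me for illustration rather than taken from the Godbolt link; set_to and private_then_public are hypothetical names. The first read happens while i is still private, so a compiler may answer it with the constant 0 without touching the stack. The second read happens after &i has been handed out, so it has to observe the store.

```c
/* Hypothetical helper: writes through a pointer it was legitimately
   given, so the store itself is fully defined. */
static void set_to(int *p, int v) {
    *p = v;
}

/* Reports the value of i before and after its address escapes. */
void private_then_public(int *before, int *after) {
    int i = 0;      /* "private": the address has not been taken yet */
    *before = i;    /* may be folded to 0, no reload required */
    set_to(&i, 5);  /* &i is taken: the slot is "public" from here on */
    *after = i;     /* must reflect the write made through the pointer */
}
```

Calling it yields before == 0 and after == 5 at any optimization level, because nothing here writes to the slot while it is still private. The snippet under discussion differs only in that the write arrives through an out-of-bounds index while the slot is still private.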

reply