I'm not quite sure what you're asking for, given that the link is for clang trunk and doesn't have the modifications discussed in TFA, and I don't dispute that clang does UB-based reasoning at -O3. But I will argue that the assembly shown can be produced without resorting to what I call "reasoning about UB", and within a flat memory model, which supports the claim that these sacrifices are often not necessary.

I'm going to draw a distinction between stack memory being "private", in the sense that only the compiler is allowed to alter it, and "public", where the address can be written to by something else and the compiler needs to handle that. Local variables are at first tracked privately. Once the address of a variable is taken with &x, or an array variable is indexed, the associated memory is public. Conceptually, the use of private memory can be indirect: the compiler could encode/decode a stack variable x as (x XOR 0xDEADBEEF) on the stack and it would be fine (though in practice the compiler does the simple thing, naturally).

Note that this notion of "private"/"public" stack memory is a property of addresses, not of the provenance of the accessing pointers, and so is fully compatible with a flat memory model. The compiler's ability to use private memory isn't a case of "reasoning about UB" in any meaningful sense -- otherwise you could just as well argue that returning from a function call is "reasoning about UB", because it assumes the return address can't be clobbered.
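To make the private/public distinction concrete, here is a minimal sketch in C. It is my own illustration, not the linked snippet: `poke` and `private_then_public` are made-up names, and the comments describe what a compiler would be allowed to do under the model above, not what clang is guaranteed to emit.

```c
/* Hypothetical external writer: stores through whatever pointer it is handed. */
static void poke(int *p) {
    *p = 42;
}

int private_then_public(void) {
    int x = 0;        /* x is "private": its address has never been taken */
    int before = x;   /* may be folded to the constant 0, no reload needed */
    poke(&x);         /* taking &x makes x's storage "public" from here on */
    int after = x;    /* must now be reloaded (or tracked through poke) */
    return before * 100 + after;
}
```

Everything here is well defined, and `private_then_public()` returns 42. The point is only that folding `before` to 0 needs no assumption about UB: nothing could legitimately have had the address of `x` at that point.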
In your provided snippet, the correctness argument for the assembly in a non-UB-reasoning universe goes like this: at first, i is stored privately on the stack with value zero, so as an optimization we can assume the value is still zero without rereading it. Only later, when &i is taken, does that memory become public, and only then does the compiler have to worry about something else altering it. In actual execution, the problem is that the write function alters compiler-private memory (and note again that being private is a property of the underlying address, not of the fact that it's accessed via out-of-bounds array indexing). That is UB, and so the program breaks. But the compiler didn't need to make _assumptions_ around UB.
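For readers without the link open, the shape of the argument can be sketched roughly as follows. This is a hypothetical reconstruction, not the actual snippet: `write_at`, `demo`, the array size, and the stored value are all assumptions I picked for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for the "write function": a plain indexed store. */
void write_at(int *arr, size_t idx, int value) {
    arr[idx] = value;   /* in bounds only if the caller guarantees it */
}

int demo(size_t idx) {
    int i = 0;               /* private: address not taken yet */
    int arr[4] = {0};        /* public: it is indexed and passed out */
    write_at(arr, idx, 7);   /* idx >= 4 would land outside arr, which is UB */
    int seen = i;            /* may be folded to 0, since i was still private */
    int *p = &i;             /* only from here on is i's storage public */
    *p += arr[0];
    return seen * 10 + i;
}
```

With in-bounds indices the result is fully determined: `demo(0)` returns 7 and `demo(1)` returns 0. With an out-of-bounds index the store may hit the slot the compiler privately chose for `i`, and folding `seen` to 0 is still justified by the privacy of that address alone.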