It wasn't until a real incident that we learned: (a) the system wasn't resilient to the utility power going on-off-on-off-on-off, as each 'off' drained the batteries while the generator started, and each 'on' made the generator shut down again; (b) the ops PCs were on UPSes but their monitors weren't (C13 vs C5 power connector); and (c) the generator couldn't be refuelled while running.
Even if you've got backup systems and you test them, you can never be 100% sure.
Turtles all the way down.
At AWS scale, even unlikely hardware events become more common, I guess.