How do you drive the probability of some series of tokens down to some known, acceptable threshold? That's a $100B question. But even if you could - can you actually enumerate every failure mode and ensure all of them are protected? If you can, I suspect your problem space is so well specified that you don't need an AI agent in the first place. We use agents to automate tasks where there is significant ambiguity or the need for a judgment call, and you can't anticipate every disaster under those circumstances.
You’re absolutely right the probability is low. According to my calculations, you’re more likely to get struck by lightning twice on the same day and drown in a tsunami.
Yet in this case, that probability clearly isn't smaller than a meteorite strike.