Of course they do say that you should review/test everything the tool creates, but in most contexts, it's sort of added as an afterthought.
I'm looking at the ticket that was opened, and you can't really be claiming that someone who did such a methodical deep dive into the issue, presented a ton of supporting context to explain the problem, and patiently collected evidence for it... does not know how to prompt well.
I started doing this a while ago (months) precisely because of issues like the ones described.
On the other hand, analyzing prompts and deviations isn't that complex... just ask Claude :)