tl;dr: the bug spans three components in different code bases that when looked at in isolation each do reasonable things. The bug is in the interaction, in the assumed properties of the value that eventually gets exposed as request.url.path. That was apparently too subtle for current Anthropic models to spot
Your response was a weak excuse, it’s a clear demonstration of the shortcomings of LLMs which will inevitably cause headlines in the future.
Whether "LLM failed to spot vulnerability that took humans 8 years to find" is a great headline about shortcomings of LLMs is questionable, but it is a good example of a category of bug that is particularly hard to spot for humans and LLMs alike