I'd previously encountered tools that seemed interesting, but as soon as I tried getting them to run I found myself going down an endless debugging hole. With an LLM I can usually explain my system's constraints, and the best models will give me a working setup I can begin iterating from. The funny part is that most of these tools are AI related in some way, yet getting a functional environment often felt impossible unless you had really modern hardware.
I use Claude Code a decent amount, and I sometimes find the opposite: it misses other areas the change will impact and ends up breaking things. When I go to test, I have to correct it and point out what it missed, or I catch the gap while it's still in the planning phase.
However, I do find that if you use the more powerful Opus model for planning, it considers the full picture much better than it used to. This is one area where I've been seeing very good improvements as the models and tooling improve.
In fact, I hope these AI tools keep getting better on exactly the point you mention, because humans have a "context limit" too. There are only so many small details I can hold in my head about a codebase, so it's good if the AI can "remember" or check those things for me.
I guess a lot of how well the AI does also depends on the codebase itself, how you prompt it, and what kind of agents file you have. If you have a robust test suite, you can very easily have the AI check its own work, catching breakage and fixing it before the task is even finished (a rough sketch of that loop is below). If you don't have any testing, more gets missed. So it's just like a human in some sense: give the AI a crappy codebase to work with and it may produce sloppy work too.
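To make that loop concrete, here's a minimal, self-contained sketch of the kind of guard-rail test an agent can run after each change. The function and test names are made up for illustration; in a real project the function under test would live in the application code, not next to the test.

```python
# checks.py -- hypothetical guard-rail test an AI tool can re-run after every edit.
import pytest


def apply_discount(price: float, percent: float) -> float:
    """Return the discounted price, clamped so it never drops below zero."""
    if percent < 0:
        raise ValueError("percent must be non-negative")
    return max(price * (1 - percent / 100), 0.0)


def test_discount_is_capped():
    # A careless refactor that drops the clamp should trip this.
    assert apply_discount(price=50.0, percent=150) == 0.0


def test_negative_percent_rejected():
    with pytest.raises(ValueError):
        apply_discount(price=50.0, percent=-10)
```

With a suite like that in place, the agents file only needs an instruction along the lines of "run `pytest -q` after every change and fix any failures before reporting the task as done," and the model gets the same fast feedback a human would rely on.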
There is a counter issue, though: realizing mid-session that the model won't be able to deliver that last 10%, and now you have to either grok a dump of half-finished code or start from scratch.