I have been "agentic coding" since Sonnet 3.5 and after this paper came out, it became my bible:
https://github.com/adobe-research/NoLiMa
Last I checked, all models suck as you fill the context window. "Context engineering" is how you do this whole thing.