Imagine a superhuman agent who does not need to run in endless loops. It could generate 100k line code-base in a few minutes or solve smaller features in seconds.
In a way, the inefficiency is what leads people to parallelism. There is only room for it because the agents are slow, perhaps the more inefficient and slower the individual agents are, the more parallel we can be.
So we are just now getting agents which can reliably loop themselves for medium size tasks. This generation opens a new door towards agent-managing-agents chain of thoughts data. I think we would only get multi-agents with high reliability sometimes by the mid to end of 2026, assuming no major geopolitical disruption.