The point is they mostly wind up somewhere stupid, and it takes expertise to spot and correct that. (Maybe that changes with further development.)
It's essentially a "brute force" approach, but in most cases, they only need to succeed once.
The article’s point is that this is not true. They wind up in bullshit attractors where they hit a wall and then get lost within their muddled context window.
> they only need to succeed once
Yet they don’t. Not on their own. Like, haven’t you had an LLM get stuck in a stupid loop, where it only gets unstuck after you point out the flaw?