In addition, you can co-author a plan for a biggish chunk of work, divided into stages, have it launch a subagent for phase 1 and check its work, then ESC-ESC to go back to just after you wrote the plan and have it do phase 2. Repeat until done. This keeps the overall goal in the main context for the review, but clears out previous reviews. Kind of like a workflow but with more control.
It's happened multiple times where I give it a task before going to sleep and when I come back it's stuck halfway through on some stupid summary, where my only response needed is basically "Yeah, continue." even though I use Opus. Using workflows for the higher level planning helped with that and those annoying pauses no longer happen, perhaps due to the main conversation being much shorter and apparently not enough for the weights to nudge towards user confirmation.
I imagine most harnesses should have a way to do this today, if they don't, get a new one. OpenCode i.e. is highly customizable, Claude and VS Code both support a ton as well including custom agents (though unclear if you can create custom top-level in claude-code)
https://opencode.ai/docs/agents/
https://code.claude.com/docs/en/sub-agents
https://code.visualstudio.com/docs/agent-customization/custo...
the main agent would be very different, basically an orchestrator, and you are "loop engineering" it, and turning off all the things for this main agent besides being able to run subagents
for opencode:
https://opencode.ai/docs/agents/#permissions (what tools, mcp, etc...)
https://opencode.ai/docs/agents/#task-permissions (what subagents it can call)
https://opencode.ai/docs/agents/#additional (thinking effort)
I see it pretty frequently in troubleshooting and data analysis flows where it will dump the data collection and aggregation into a sub agent then pull out a summarized result.
I'll do something similar where I have the main agent maintain context in a design doc/markdown file and update as it goes along. Then I can clear/restart/handoff at will
In a way it's a generalization of the spec-driver approach, but in addition to the the formal spec the carryover buffer lives in the memory.
AI vendors still need to compete with each other both in terms of token cost and competency. An agent that is costly and less effective by wasting tokens is less competitive.
[User] Actual human prompt
[Agent] Attempted use of tool & hand slap
[Agent] call(projection of user's prompt relative to discovered tool constraints)
["User"] Prompt from above call
[Agent] Legal tool use
[Agent] ... until satisfied
[Agent] return(summary that satisfies the prompt for this level of execution)
[Agent] Additional call() invokes possible depending on returned summary
[Agent] Final return(summary) from root ends this turn of conversation and user sees summary
[User] Next turn of conversation initiated by actual humanI don't bother warning it in the system prompt anymore. It's pointless. I let it bump its head as required. A few hundred tokens and the agent is back on track each time.
The only tools permissible to root in my scheme are call() and return().