Tools, memories, sandboxing, steering, etc.
Because there isn't really much more to it. Ever since we, i.e. those of us who played with the ChatGPT API early on, bolted tools onto it, some half a year before OpenAI woke up and officially named the pattern "function calling", we've known that the harness[0] was the key. What kept changing was which logic (and how much of it) to encode explicitly in the harness, versus pushing it back to the model on the "main thread", versus pushing it to a model on a separate conversation track. But the basic insight remains the same.
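For the record, "bolting tools on" back then meant instructing the model to emit tool calls in some fixed text format, parsing its output yourself, running the tool, and feeding the result back in as a regular message. A rough reconstruction of that pattern; the format, the search tool, and the model name here are made up for illustration:

```python
import re
from openai import OpenAI

client = OpenAI()

SYSTEM = (
    "You can use tools. To call one, reply with exactly one line:\n"
    "TOOL: <name>(<arg>)\n"
    "Available tools: search(query). "
    "When you have the final answer, reply normally."
)

def search(query: str) -> str:
    # Stand-in tool for the sketch.
    return f"(pretend search results for {query!r})"

TOOLS = {"search": search}
TOOL_RE = re.compile(r"TOOL:\s*(\w+)\((.*)\)")

def run(user_msg: str, max_steps: int = 5) -> str:
    messages = [{"role": "system", "content": SYSTEM},
                {"role": "user", "content": user_msg}]
    for _ in range(max_steps):
        reply = client.chat.completions.create(
            model="gpt-4o-mini", messages=messages
        ).choices[0].message.content
        messages.append({"role": "assistant", "content": reply})
        m = TOOL_RE.search(reply)
        if not m:
            return reply  # no tool call: this is the final answer
        name, arg = m.group(1), m.group(2).strip("\"' ")
        result = TOOLS.get(name, lambda a: f"unknown tool {name}")(arg)
        # No "tool" role existed yet, so the result goes back as a user message.
        messages.append({"role": "user", "content": f"TOOL RESULT: {result}"})
    return reply
```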
--
[0] Well, today it's a "harness"; until recently you'd have called it a "runner" or a "runtime".
AI companies would love it if everything ran in their cloud, but arguably there are latency and other reasons (privacy, access to local files) to run at least some of it on your own computer.
The harness is the part that makes the API calls, interacts with the user, executes the function calls, and keeps track of the conversation memory.
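In code terms, that's a small loop. A minimal sketch against the current OpenAI-style function-calling API; the read_file tool and the model name are my own placeholders, not any particular product's internals:

```python
import json
from openai import OpenAI

client = OpenAI()

def read_file(path: str) -> str:
    with open(path) as f:
        return f.read()

TOOLS = {"read_file": read_file}
TOOL_SPECS = [{
    "type": "function",
    "function": {
        "name": "read_file",
        "description": "Read a text file from disk.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}]

def harness(messages: list) -> str:
    # The loop IS the harness: call the model, execute any tool calls it
    # requests, append the results to the conversation memory, repeat.
    while True:
        msg = client.chat.completions.create(
            model="gpt-4o-mini", messages=messages, tools=TOOL_SPECS
        ).choices[0].message
        if not msg.tool_calls:
            return msg.content  # no tool requests left: final answer
        messages.append(msg)
        for call in msg.tool_calls:
            args = json.loads(call.function.arguments)
            result = TOOLS[call.function.name](**args)
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": result,
            })
```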
You can also use the LLM to summarize the conversation into a single shorter message; that's compaction (sketched below). And instead of statically defining which functions are available to the LLM, you can stand up an MCP server, which lets the model auto-discover which functions it can call and what they do.
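Compaction is just as small. A sketch, assuming the history is a plain list of dicts with the system prompt first; the threshold, prompt wording, and model name are arbitrary illustrative choices:

```python
def compact(client, messages: list, keep_last: int = 4) -> list:
    if len(messages) <= keep_last + 2:
        return messages  # nothing worth compacting yet
    system, old, recent = (messages[0],
                           messages[1:-keep_last],
                           messages[-keep_last:])
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in old)
    summary = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user",
                   "content": "Summarize this conversation so it can "
                              "replace the original turns:\n" + transcript}],
    ).choices[0].message.content
    # One short summary message stands in for the whole old history.
    return [system,
            {"role": "user",
             "content": f"[Summary of earlier turns] {summary}"},
            *recent]
```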
That’s the whole magic of something like Claude Code. The rest is details.
Personally, it embodies a level of autonomy for me. I define that as an AI model with the potential to interact with something external to itself based on its output, where "something external" includes its own future behavior.