When a model is trained on multiple contexts, some growing over time as conversations do now, and some rolling at various sizes (always-on streams such as a clock, a video feed, an audio feed, data streams, or tool calls), we no longer have to 'pollute' the main context with repetitive data.
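Roughly, the two kinds of context described above could be sketched like this (a minimal illustration, not a real harness; all names and sizes are made up):

```python
from collections import deque

# Sketch of the two context kinds: a growing context (the conversation)
# that only appends, and rolling contexts (clock, video, audio, tool
# calls) that keep a fixed-size window. Window sizes are illustrative.

conversation: list[str] = []      # grows over time
clock_feed = deque(maxlen=4)      # rolling, small window
video_feed = deque(maxlen=64)     # rolling, larger window

conversation.append("user: hello")
for t in range(10):
    clock_feed.append(f"t={t}")

# The rolling feed retains only its last 4 entries, so the growing
# main context never accumulates this repetitive data.
```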
But this is going in the direction of 1 agent = 1 mind, when it's much more likely that human cognition (and maybe all cognition) requires 'ghosts' and subprocesses. An agent is more likely a configurable building block of a(n alien) mind.
The problem is that the models are not trained for this, nor for any other non-standard agentic approach. It's like fighting their 'instincts' at every step, and the results I've been getting have not been great.
This is absolutely the hardest bit.
I guess the shortcut is to include the entire chat history; then, if the history contains "do X" followed by "no actually do Y instead", the LLM can figure that out. But isn't it fairly tricky for the agent harness to figure that out itself, to work out relevancy and what context to keep? Perhaps this is why the industry defaults to concatenating messages into a conversation stream?
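That default is about as simple as it sounds. A minimal sketch (the message shapes and names here are illustrative, not any vendor's actual API):

```python
# The industry-default approach: the harness does no relevance
# filtering, it just concatenates every message into one stream and
# lets the model resolve supersession ("do X" ... "no actually do Y")
# on its own.

def build_prompt(history: list[dict]) -> str:
    """Concatenate the full chat history into a single context stream."""
    return "\n".join(f"{m['role']}: {m['content']}" for m in history)

history = [
    {"role": "user", "content": "do X"},
    {"role": "assistant", "content": "Working on X."},
    {"role": "user", "content": "no actually do Y instead"},
]

prompt = build_prompt(history)
# Both instructions survive, in order, so the model rather than the
# harness gets to judge that "do Y" supersedes "do X".
```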
As you build up a "body of work", it gets better at handling massive, disparate tasks, in my admittedly short experience. I've been running this for two weeks and am trying to improve it.
Or maybe they haven't thought about it?
Or they tried some simple alternatives and didn't find clear benefits?
> The key is to give the agent not just the ability to pull things into context, but also remove from it.
But then you need rules to figure out what to remove, which probably involves feeding the whole thing to a(nother?) model anyway to make that fuzzy heuristic judgment of what's important and what's a distraction. And simply removing messages doesn't add any structure; you still just have a sequence of whatever remains.
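The shape of that pruning pass might look like this. This is only a sketch: `score_relevance` is a stub standing in for the model call that would make the fuzzy judgment, and every name is hypothetical.

```python
# Sketch of context pruning. score_relevance is a STUB for the
# "a(nother?) model" judgment call; a real harness would replace its
# body with an LLM request.

def score_relevance(message: str, task: str) -> float:
    """Stub for the fuzzy relevance judgment a model would make (0-1).
    Crude word overlap, only so the sketch runs."""
    task_words = set(task.lower().split())
    msg_words = set(message.lower().split())
    return len(task_words & msg_words) / max(len(task_words), 1)

def prune(history: list[str], task: str, threshold: float = 0.3) -> list[str]:
    """Drop messages the scorer deems irrelevant. Note the result is
    still just a flat sequence: pruning removes items but adds no
    structure to what remains."""
    return [m for m in history if score_relevance(m, task) >= threshold]

history = [
    "fix the login bug",
    "here is a cat picture",
    "the login bug is in auth.py",
]
kept = prune(history, task="fix the login bug")
```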
Of course Anthropic/OpenAI can do it. And the next day everyone will be complaining about how much Claude/Codex has been dumbed down. They don't even comply with the context anymore!
Three persistent Claude instances share AMQ, plus a Memory Index queried with an embedding model (which I'm literally upgrading to Voyage 4 nano as I type). It's working well so far: I have an instance, Wren, "alive" and functioning very well for 12 days and counting, swapping things in and out of context via the MCP without relying on any of Anthropic's tools.
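The general pattern there, an embedding-indexed memory store queried to decide what to swap into the live context, can be sketched in a few lines. This is not their actual code: `embed` is a toy stand-in for a real embedding model (they use Voyage), and all names are illustrative.

```python
import math

# Minimal sketch of an embedding-keyed memory index. embed() is a toy
# bag-of-letters "embedding" so the sketch runs without an API; a real
# setup would call an embedding model (e.g. Voyage) instead.

def embed(text: str) -> list[float]:
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class MemoryIndex:
    def __init__(self) -> None:
        self.items: list[tuple[list[float], str]] = []

    def add(self, text: str) -> None:
        self.items.append((embed(text), text))

    def query(self, text: str, k: int = 2) -> list[str]:
        """Return the k stored memories most similar to the query,
        i.e. the candidates to swap into context."""
        q = embed(text)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[0]),
                        reverse=True)
        return [t for _, t in ranked[:k]]

idx = MemoryIndex()
idx.add("the login bug lives in auth.py")
idx.add("lunch order: tacos")
hits = idx.query("login auth", k=1)
```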
And it's on a cheap LXC with 8 GB of RAM and an N97.
I just make stuff to share with others, so yeah, good point.
Maybe there’s a way to play around with this idea in pi. I’ll dig into it.