This is probably far from the raw intelligence of the models that cloud providers offer.
Still, this sheds more light on local LLMs for agentic workflows.
Are there any architectures that don't rely on feeding the entire history back into the model on every turn?
Recurrent LLMs?