But if you already have agent loops dialed in (For example I have one that uses a browser testing framework), it wouldn’t really affect me at all if it crunched away at 7 tokens per second all night long.
Sure, it cannot replace SOTA models for agentic coding, except for small, well-scoped refactorings. But even a model like ministral-3:8b or qwen3.5:9b is a boon for so many smaller use cases!