The harness is super important: the available tools and the system prompts vary from harness to harness.

Anthropic seems to have a modest lead on their harness and models, so it’s a best-of-both-worlds scenario.

> I'm not sure what Microsoft is doing behind the scenes

It’s probably the exact same model, but the tools and the prompts around it are worse, so you get worse results.

reply
Claude in Claude Code has been shown to perform consistently worse in evals than Claude + a minimal harness.
reply
The harness was absolutely not an issue in my case.

The new pricing model was: I got banned from using Opus entirely, and half a day of work (with weaker models) consumed the $10 plan.

I'm now using a Claude Max subscription and I can get close to the daily limits but I'm fairly happy with the overall plan consumption.

reply
So if you use Claude via Copilot in Zed... You use Zed's harness, I think? What does Copilot do, at that point?
reply
I believe you are using https://github.com/github/copilot-cli or potentially this https://github.com/github/copilot-language-server-release#ag... via the Agent Client Protocol https://github.com/agentclientprotocol/agent-client-protocol which means you are indeed using Copilot's harness

ACP is just a standard that bridges harnesses easily into IDEs, Text Editors, or whatever consumes it (I wrote a TUI that consumes them)

The registry for all the agents (tool harnesses) is here https://github.com/agentclientprotocol/registry if you are ever curious what Zed or IntelliJ are really hooking into

reply
Ah OK, so the ACP connector ensures tool calls work with Zed, and communicates the available tools and their results to the harness, and then the harness mainly provides a system prompt and the API calls?
reply
It’s providing the inference of Anthropic models
reply
I had a similar experience moving away from Copilot within Zed. I'm now using the Reasonix harness for DeepSeek, which makes cache hits almost free. And that's with unsubsidized American providers like DigitalOcean or Cloudflare.
reply
I tried using Zed but with local models it constantly breaks on tool calls. I wanted to like it but the smell of vibing is just too much.
reply
Likewise, and that's with state-of-the-art technology. I wish a true self-contained binary for Reasonix Desktop was released; for now I have to settle for providing a Flake.nix environment. It isn't nearly as fickle as Zed, but I wish they leveraged the power of the Go toolset more.
reply
You using models released this year? I hear this complaint a lot, and it's often due to using an old model which is not as good at tool calling as newer models.
reply
What I noticed is that when the conversation starts, the agent is perfectly able to read from and write to files. As the conversation continues (and maybe subagents are spawned), it forgets how to do this, complains, and tries to resort to running shell or Python code, which sometimes works. Sometimes it asks me to execute the code. If I refuse and point out that it worked before, then sometimes it remembers how to write, but mostly it doesn't and I need to start a new session.

When using Zed with the Copilot integration I use Claude Opus and have never had this issue.

reply
Qwen 3.6 and 3.5...
reply
Yep, Reasonix is an absolute case study in caching. They literally built a byte-level cache into their design and it is insane. I can one-shot many workflows and apps for under 0.05 cents.
reply
Nice.

I paid $6 yesterday for DeepSeek V4 Flash on OpenRouter. That's like $120 a month, and it's not even a good model.

reply
For DS4 it's much cheaper and more reputable to use the OpenCode Go $10/mo subscription, or the DeepSeek API directly.
reply
Sometimes $10 is more than I'll do with API tokens. I prefer the top up scheme for peace of mind, but the deal does sound generous. The only concern is sustainability, similar to subsidized copilot pricing having to change.
reply
Thanks!

I'll try that.

reply
That's quite an achievement. I managed to spend only $2 on 16 different tasks with V4 Pro.
reply
Yeah, V4 Flash is dirt cheap, but it runs in circles quite often.

Might very well be that a better model is cheaper if it gets things right the first try.

Maybe I should route to a better model when V4 Flash hasn't solved the task after a specific number of tokens.
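
A rough sketch of that escalation idea, in case it helps. Everything here is hypothetical: `Attempt`, `ModelRunner`, and `run_with_escalation` are made-up names, the runners stand in for whatever client actually calls the cheap and strong models, and "solved" is assumed to be something you can check yourself (tests passing, for example).

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Attempt:
    solved: bool       # did the model finish the task (e.g. tests pass)?
    tokens_used: int   # tokens consumed by this attempt
    output: str        # whatever the model produced


# Hypothetical interface: a runner takes (task, token_budget) and must stop
# once the budget is exhausted, returning what it has so far.
ModelRunner = Callable[[str, int], Attempt]


def run_with_escalation(
    task: str,
    cheap: ModelRunner,
    strong: ModelRunner,
    cheap_budget: int,
    strong_budget: int,
) -> Tuple[str, Attempt]:
    """Try the cheap model first; escalate only if it fails within its budget."""
    first = cheap(task, cheap_budget)
    if first.solved:
        return "cheap", first

    # The cheap model burned its budget without solving the task, so hand the
    # task to the stronger model and report the combined token usage.
    second = strong(task, strong_budget)
    total = first.tokens_used + second.tokens_used
    return "strong", Attempt(second.solved, total, second.output)
```

Whether this saves money depends on how often the cheap model fails: every escalation pays for the wasted cheap attempt on top of the strong one, so the budget for the first attempt should probably stay small.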

reply
I'm having great success with DS4 Pro as my main model, while using DS4 Flash for subagents.
reply
What is the average monthly token price for daily reasonix use?
reply
For me it's about $5 of work, where I've done equivalent work for about $200.
reply
Same, I switched to Cursor. I told it how to invoke msbuild and it can edit away without needing a native Visual Studio plugin. No problems at all. Target language: C++.
reply
GitHub Copilot costs have ballooned in recent weeks: what once took $100 now requires $300. I like using Claude with VS Code through Copilot, and I feel it’s given me much better code whose quality I can control. It’s much more transparent than Claude Code. It’s open source, and the IDE interface offers many more features that give you context and control over what’s generated. The increase in cost isn’t purely due to their price increases; the Opus models the agents use also consume more tokens. So I’ve moved to Claude Code and I’m happily still using Opus 4.6. Fable and 4.7 seem to do much larger units of work, go off on tangents, and make assumptions that frequently result in slop.
reply
My Copilot quota ran out in maybe 2-3 prompts with Claude 4.8 Opus. I was expecting it to suck, but not this badly. It was good while it lasted, though.