With Anthropic, you either pay per token with an API key (expensive), or use their subscription, but only with the tools that they provide you - Claude, Claude Cowork and Claude Code (both GUI and CLI variants). Individuals generally get to use the subscriptions; companies, especially the ones building services on top of their models, are expected to pay per token. The same applies to various third-party tools.

The belief is that the subscriptions are subsidized (or at least cut heavily into their profit margins), so for whatever reason they're trying to maintain control over the harness - maybe to gather more usage analytics, gain an edge over competitors and tune their models to work better with it, or perhaps to route certain requests to Haiku or Sonnet instead of using Opus for everything, to cut down on the compute.

Given the ample usage limits, I personally just use Claude Code now with their 100 USD per month subscription because it gives me the best value - kind of sucks that they won't support other harnesses though (especially custom GUIs for managing parallel tasks/projects). I've also used Codex and Gemini CLI; OpenCode never worked well for me on Windows.

reply
>or perhaps to route certain requests to Haiku or Sonnet instead of using Opus for everything, to cut down on the compute

You can point Claude Code at a local inference server (e.g. llama.cpp, vLLM) and see which model names it sends each request to. It's not hard to do a MITM against it either. Claude Code does send some requests to Haiku, but not the ones you're making with whatever model you have it set to - these are tool result processing requests, conversation summary / title generation requests, etc - low complexity background stuff.
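For anyone who wants to try the "see which model names it sends" part, here is a minimal sketch of a logging stub, not a real inference server. It assumes Claude Code can be redirected with the `ANTHROPIC_BASE_URL` environment variable and that requests are JSON bodies with a top-level `model` field. The port, the `extract_model`, `ModelLogger` and `serve` names, and the 501 reply are all my own choices for illustration. Because the stub never answers with a real completion, you would only see the first requests; to watch a whole session you'd log the same field in front of an actual llama.cpp or vLLM server.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def extract_model(body: bytes):
    """Return the "model" field of a JSON request body, or None if absent/invalid."""
    try:
        payload = json.loads(body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return None
    if isinstance(payload, dict):
        return payload.get("model")
    return None


class ModelLogger(BaseHTTPRequestHandler):
    """Hypothetical handler: prints the requested model for every POST it receives."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        print(f"{self.path} -> model={extract_model(body)!r}")
        # This sketch only logs; it replies with an error instead of running inference.
        reply = json.dumps({"error": "logging stub, no inference"}).encode()
        self.send_response(501)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)


def serve(port: int = 8080):
    """Listen on localhost only; the port number is an arbitrary choice."""
    HTTPServer(("127.0.0.1", port), ModelLogger).serve_forever()
```

Call `serve()` in one terminal, then start Claude Code in another with `ANTHROPIC_BASE_URL=http://127.0.0.1:8080` (again, assuming that variable is honored by your version).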

Now, Anthropic could simply take requests to their Opus model and internally route them to Sonnet on the server side, but then it wouldn't really matter which harness was used or what the client requests anyway, as this would be happening server-side.

reply
Sounds pretty sane, the same way OpenWebUI and probably other software out there has a concept of “tool models”: something you use for all the lower priority stuff.

Actually curious to hear what others think about why Anthropic is so set on disallowing 3rd party tools on subscriptions.

reply
The SOTA models are largely undifferentiated from each other in performance right now. And it’s possible open-weight models will get “good enough” relatively soon. This creates a classic case where inference becomes a commodity. Commodities have very low margins. Training costs put them in an economic hole, and low margins on top of that will kill them.

So they have to move up the stack to higher-margin business solutions. Which is why they offer subsidized subscription plans in the first place. It’s a marketing cost. But they want those marketing dollars to drive usage up the stack, not commodity inference use cases.

reply
Anthropic's model deployments for Claude Code are likely optimized for Claude Code. I wouldn't be surprised if they had optimizations like sharing the system prompt KV-cache across users, or a speculative decoding draft model specifically fine-tuned for the way Claude Code does tool calls.

When setting your token limits, their economics calculations likely assume that those optimizations are going to work. If you're using a different agent, you're basically underpaying for your tokens.

reply
- OR - it's about lock-in.

Build the single pane of glass everyone uses. Offer it under cost. Salt the earth and kill everything else that moves.

Nobody can afford to run alternative interfaces, so they die. This game is as old as time. Remember Reddit apps? Alternative Twitter clients?

In a few years, CC will be the only survivor and viable option.

It also kneecaps attempts to distill Opus.

reply
It’s probably a mixture of things, including direct control over how the API is called and used, as pointed out above, and giving a discount for using their ecosystem. They are in fact a business, so it should not surprise anyone that they act like one.
reply
It might well be a mixture, but 95% of that mixture is vendor lock-in. Same reason they don't support AGENTS.md: they want to add friction to switching.
reply
They can try to add as much friction as they want. Simply renaming files and directories like .claude is enough to move off CC.

It’s not like moving from Android to iOS.

reply
You'd be surprised how effective small bits of friction are.
reply
If it were lock-in, they wouldn't make it absolutely trivial to change inference providers in Claude Code.
reply
It's very straightforward to instrument CC under tmux with send-keys and capturep. You could easily use that for distillation, IMO. There are also detailed I/O logs.
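A rough sketch of what that instrumentation could look like, assuming CC is already running in a tmux pane. The pane target (e.g. `cc:0.0`) and the helper names are hypothetical; `send-keys` and `capture-pane -p` are standard tmux commands (`capturep` is the short alias for the latter).

```python
import subprocess


def send_keys_cmd(target: str, text: str) -> list:
    """Build the tmux command that types `text` into pane `target`, then presses Enter.

    Caveat: tmux treats arguments that look like key names (e.g. "Enter") as keys,
    so text that collides with a key name would need `send-keys -l` instead.
    """
    return ["tmux", "send-keys", "-t", target, text, "Enter"]


def capture_pane_cmd(target: str) -> list:
    """Build the tmux command that prints the visible contents of pane `target`."""
    return ["tmux", "capture-pane", "-p", "-t", target]


def run(cmd: list) -> str:
    """Run a command and return its stdout; raises if tmux is missing or fails."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
```

Usage would be something like `run(send_keys_cmd("cc:0.0", "explain this repo"))`, waiting for the reply, then `print(run(capture_pane_cmd("cc:0.0")))`.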
reply
Subscription = token that requires refreshing 1-2x/day, and you get the freedom to use your subscription-level usage amount any way you want.

API = way more expensive, but you're allowed to use it on your terms without Anthropic hindering you.

reply
Also, subscription: using it outside Claude Code is against the TOS, so you need to spoof a token and may get banned for it.
reply
Anthropic has an API. You can use any client with it, but they charge per input/output/cache token.

One-price-per-month subscriptions (Claude Pro/Max @ $20/$100/$200 a month) use a different authentication mechanism, OAuth. The useful difference is that you get a lot more inference than the same money would buy through the API, but they require you to use Claude Code as the client.

Some clients have made it simple to use your subscription credentials with them, and they are getting cease-and-desist letters.

reply
The API costs about 30 times more.
reply