It's because the subscriptions force you to do so. The subscriptions are the most economical way to use e.g. Claude by close to an order of magnitude. If you max out a 20x plan every week, doing the same work with the API would cost you well into the four figures.
Anyone already paying Claude API pricing and using CC instead of OpenCode is kneecapping themselves.
The immediate thing I've noticed: I get way more out of the Codex $100 plan than I was getting out of the Anthropic $200 plan. Like, probably 2x at least.
The other thing I've noticed: when using strict guardrails, TDD, reviews etc., I can't see any quality difference. Not only between Opus and Codex, but even between the most recent models: GPT 5.3 code, GPT 5.4, and now GPT 5.5.
Well, 5.5 uses a huge amount of my session limits. 5.3 is very light, 5.4 somewhere in between. So now I use 5.4 for the main session/debugging/planning and then execute with 5.3.
Regarding usage, of course, it's hard to say how much is the model and how much is coming from Claude Code and all this ridiculous malware scanning.
But it's nice to use a lightweight harness like pi and see that even with all my personal instructions, a good bunch of skills, custom tools etc., if I start a session and say "hi" I'm starting out with about 15k of context used. I think a closely equivalent setup in CC would start at 30-40k context.
5.5 has been a noticeable improvement over 5.4, solving more complicated issues and faster too.
5.5 does not use a huge amount of my session limits with the $100 plan.
I use multiple conversations in parallel, all on xhigh effort with Fast on (2.5x consumption), and it’s still enough for me not to switch off Fast.
It also runs my tests, but I did not use TDD apart from sometimes telling it to cover an issue in a test before fixing it.
If you want to plug your API keys into a third-party harness, that's totally cool, and honestly I'm looking into doing that right now; I haven't used any of the first-party harnesses at all. But the first time I accidentally spend $300 in a day, I may start thinking that a $20/month plan is pretty good: even if performance is inconsistent, at least I know what my costs are.
It aligns the incentives toward faster, cheaper, terser and more reliable models, because the model providers pay for the wasted tokens and electricity.
Did you mean 100 billion tokens? Because 100k isn't a big deal at all.
The best-performing and most capable ones are all the ones that aren't tied to a specific API.
However, nobody is agreeing with that. "That's how it's done," and move faster, faster, because of the gold rush! Faster!!!