For there to be any trust in the above, the tool needs to behave predictably day to day. It shouldn't be possible to open your laptop and find that Claude suddenly has an IQ 50 points lower than yesterday. I'm not sure how you achieve predictability while keeping inference costs in check by messing with quantization, prompts, etc. on the backend.
A better approach might be to version both the models and the system prompts, but frequently adjust the pricing of a given combination based on token efficiency, to nudge users toward cheaper modes on their own. Let users choose how much they pay for a given quality of output, though.
Frontier LLMs still suck a lot; you can't afford planned degradation yet.
Right now my solution is to run CC in tmux and keep a second CC pane with /loop watching the first pane, killing CC if it detects plan mode being bypassed. Burning tokens to work around a bug (a token-free shell version of the same idea is sketched below).
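For what it's worth, the watchdog half can be approximated without a second Claude instance: a plain shell loop polling the first pane. A rough sketch, assuming tmux; the pane target `cc:0.0` and the grep pattern are placeholders for whatever your setup actually shows:

```sh
# Token-free stand-in for the /loop watcher: poll the Claude Code pane
# and interrupt it if the (assumed) plan-mode-bypass text shows up.
# "cc:0.0" and the grep pattern are placeholders for your own layout.
while true; do
  if tmux capture-pane -p -t cc:0.0 | grep -qi "bypass"; then
    tmux send-keys -t cc:0.0 C-c   # send Ctrl-C to the CC process
    break
  fi
  sleep 5
done
```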
If only there were a place with 9,881 pieces of feedback waiting to be triaged...
And maybe not handled by a duplicate-bot that goes wild and auto-closes everything; just blessing some of the stuff there with a "you've been seen" label would go a long way...
Or improve performance and efficiency, if we’re generous and give them the benefit of the doubt.
It makes sense, in a way. The subscription deal is something along the lines of a fixed, predictable price in exchange for Anthropic controlling usage patterns, scheduling, throttling (quota consumption), defaults, and the effective workload shape (system prompt, caching) in whatever way best optimises the system for them (or for us, if we’re feeling generous again) and makes the deal sustainable for them.
It’s a trade-off.
It may be (but I wouldn’t know) that some of the other changes not covered here reduced costs on their side without impacting users, improving the viability of their subscription model. Or maybe they even improved things for users.
I’d really appreciate more transparency on this, and not just when things fail.
But I’ve learned my lesson. I’ve been weaning off Claude for a few weeks: I cancelled my subscription three weeks ago, let it expire yesterday, and moved to both another provider and a third-party open-source harness.
If you worry about a "degraded" experience, then let people choose. People won't be using other wrappers if they turn out to be bad. People ain't stupid.
> On April 16, we added a system prompt instruction to reduce verbosity. In combination with other prompt changes, it hurt coding quality, and was reverted on April 20. This impacted Sonnet 4.6, Opus 4.6, and Opus 4.7
They can pick the default reasoning effort:
> On March 4, we changed Claude Code's default reasoning effort from high to medium to reduce the very long latency—enough to make the UI appear frozen—some users were seeing in high mode
They can decide what to keep and what to throw out (beyond simple token caching):
> On March 26, we shipped a change to clear Claude's older thinking from sessions that had been idle for over an hour, to reduce latency when users resumed those sessions. A bug caused this to keep happening every turn for the rest of the session instead of just once, which made Claude seem forgetful and repetitive. We fixed it on April 10. This affected Sonnet 4.6 and Opus 4.6
It literally is all in the post.
I don't worry about anything though. It's not my product. I don't work for Anthropic, so I really couldn't care less about anyone else's degraded (or not) experience.
They control the default system prompt. You can change it if you want to.
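If I remember the CLI right, Claude Code has an `--append-system-prompt` flag for exactly this. The instruction text below is a made-up example, not Anthropic's wording:

```sh
# Assumed Claude Code flag; appends your own instructions to the default
# system prompt rather than replacing it. The example text is hypothetical.
claude --append-system-prompt "Be terse. Always stay in plan mode until approved."
```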
> They can pick the default reasoning effort
I don't see how that's an obstacle to allowing third-party wrappers.
> They can decide what to keep and what to throw out
That's actually a good point. However, I still don't think it's an obstacle. If third-party wrappers were bad, people simply wouldn't use them.
Defaults matter. A large share of people never change them (status quo bias, psychological inertia). Having control over them (and usage quotas) means Anthropic can control and fine-tune what this fixed subscription costs them.
And evidently (re: the original article), they tried to do so.
Allowing third-party wrappers doesn't mean Claude Code would cease to exist. The opposite, actually: Claude Code would be the default.
People dissatisfied with Code would simply use other wrappers. I call it a win-win. I don't see how Anthropic would lose here; they would still retain the ability to control the defaults.
I have no idea what the share of OpenClaw instances running on pi was, or third-party wrappers in general, but it was obviously large enough that Anthropic decided they had to put an end to it.
Conversely, from the latest developments, it would seem they are perfectly fine with people running OpenClaw with Claude models through Claude Code’s programmatic interface using subscriptions.
But in the end, this, my take, your take, is all conjecture. We are both on the outside looking in.
Only the people who work at Anthropic know.
A reminder: your vibe-coded slop required a peak of 68 GB of RAM, and you had to hire actual engineers to fix it.
... But then again, many of us are paying $100 to $200 USD a month out of pocket.
Far more than for any other development tool.
Services that cost that much money generally come with expectations.
A month prior, their vibe-coders were unironically telling the world that their TUI wrapper for their own API is a "tiny game engine," while they were (and still are) struggling to output a couple hundred characters on screen: https://x.com/trq212/status/2014051501786931427
Meanwhile Boris says "Claude fixes most bugs by itself" while they break the most trivial functionality all the time (https://x.com/bcherny/status/2030035457179013235 https://x.com/bcherny/status/2021710137170481431 https://x.com/bcherny/status/2046671919261569477 https://x.com/bcherny/status/2040210209411678369) and claim they "test carefully": https://x.com/bcherny/status/2024152178273989085
Once OpenAI added the $100 plan, it was kind of a no-brainer.