by grewil24 hours ago |

by landr0id4 hours ago|

Relevant: https://www.reddit.com/r/ClaudeAI/comments/1s7zgj0/investiga...

https://www.reddit.com/r/ClaudeAI/comments/1s7mkn3/psa_claud...

by onemoresoop39 minutes ago|

That explains things. I'm getting this: API Error: 400 {"error":{"message":"Budget has been exceeded! Current cost: 271.29866200000015, Max budget: 200.0","type":"budget_exceeded","param":null,"code":"400"}}

So I completely ran out of tokens even though I haven’t used it at all for the past couple of days, and last week my usage was very light. Let me scratch that, all my usage has been very light since I got this plan at work. It’s an enterprise subscription, I believe; it's hard to tell since it doesn’t connect directly to Anthropic but goes through a proxy on Azure.

I'm not liking this at all; it's so flaky and opaque. It's not possible to get a breakdown of what the usage went on, right? Do we have to contact Anthropic for a refund, or will they restore the bogus usage?
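
A minimal sketch for pulling the numbers out of an error body shaped like the one quoted above. The field names and message format are taken from that single example only; the helper name `parse_budget_error` is hypothetical, and nothing here is a documented API of the proxy or of Anthropic.

```python
import json
import re

# Error body copied from the message quoted above.
BODY = (
    '{"error":{"message":"Budget has been exceeded! '
    'Current cost: 271.29866200000015, Max budget: 200.0",'
    '"type":"budget_exceeded","param":null,"code":"400"}}'
)

def parse_budget_error(body):
    """Return current cost, max budget and overage, or None if the body
    is not a budget error in the assumed format."""
    err = json.loads(body).get("error", {})
    if err.get("type") != "budget_exceeded":
        return None
    # Assumed message format, inferred from the one example above.
    match = re.search(
        r"Current cost: ([0-9.]+), Max budget: ([0-9.]+)",
        err.get("message", ""),
    )
    if match is None:
        return None
    current = float(match.group(1))
    limit = float(match.group(2))
    return {"current": current, "max": limit, "overage": current - limit}

result = parse_budget_error(BODY)
print(result)
```

This only shows how far over the cap the account is; it says nothing about which requests produced the spend.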

by nixpulvis2 hours ago|

This is a serious problem: it's nearly impossible to understand what a "token" is and how to tame their use in a principled way.

It's like if cars didn't advertise MPG, but instead something that could change randomly.

by clawfund1 hours ago|

The opacity isn't accidental. When customers can't forecast cost, they can't build an internal business case to switch, which is more durable than lock-in from product quality. The first AI coding vendor to offer deterministic $/task pricing will own enterprise procurement conversations. Right now the entire market is hiding behind 'tokens' specifically to avoid that comparison.

by claw-el1 hours ago|

Also, certain models are more verbose than others. We are basically at the mercy of a model that likes to ramble a lot.

by konfusinomicon37 minutes ago|

i'm fairly certain the knob on the machine that controls the length of redundant comments and docblocks is cranked to 11. it makes me curious how much of their bottom line is driven by redundant comment output.

by amitprasad1 hours ago|

Relevant post: https://modal.com/blog/dollars-per-token-considered-harmful

(disclaimer: I work with the author)

by nixpulvis1 hours ago|

I completely agree that requests are what should be charged for. But I think there are two things, given that requests aren't all going to cost the same amount:

1. Estimate-free: invoice the requests and let users figure it out after the fact.
2. Somehow estimate the cost and tell users how much a request will cost.

We have 1, we want 2.
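
One way to sketch option 2 is an upper-bound estimate computed before the request is sent. The default prices below reuse the "$3/$15 per Mtok" figure quoted elsewhere in this thread, read as input/output prices, which is an assumption. `estimate_request_cost` and its parameters are hypothetical names, and real token counts would have to come from a tokenizer.

```python
def estimate_request_cost(input_tokens, max_output_tokens,
                          input_price_per_mtok=3.0,
                          output_price_per_mtok=15.0):
    """Return (lower, upper) dollar bounds for a single request.

    lower: input tokens only, as if the model produced no output.
    upper: input tokens plus the full output-token cap.
    Caching discounts and hidden reasoning tokens are ignored here.
    """
    input_cost = input_tokens * input_price_per_mtok / 1_000_000
    max_output_cost = max_output_tokens * output_price_per_mtok / 1_000_000
    return input_cost, input_cost + max_output_cost

# Example: a 200k-token prompt with an 8k-token output cap.
low, high = estimate_request_cost(200_000, 8_000)
print(f"between ${low:.2f} and ${high:.2f}")
```

The gap between the two bounds is exactly the part users cannot predict today: how much the model decides to write.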

by uoaei2 hours ago|

Like if cars measured fuel efficiency or range using the knobs in the tread on your tire.

by conception4 hours ago|

I noticed the 1M context window is the default and there's no way not to use it. If your context is at 500-900k tokens every prompt, you’re gonna hit limits fast.

by Wowfunhappy3 hours ago|

I had to double check that they'd removed the non-1M option, and... WTF? This is what's in `/config` → `model`

    1. Default (recommended)    Opus 4.6 with 1M context · Most capable for complex work
    2. Sonnet                   Sonnet 4.6 · Best for everyday tasks
    3. Sonnet (1M context)      Sonnet 4.6 with 1M context · Billed as extra usage · $3/$15 per Mtok
    4. Haiku                    Haiku 4.5 · Fastest for quick answers

So there's an option to use non-1M Sonnet, but not non-1M Opus?

Except wait, I guess that actually makes sense, because it says Sonnet 1M is billed as extra usage... but also WTF, why is Sonnet 1M billed as extra usage? So Opus 1M is included in Max, but if you want the worse model with that much context, you have to pay extra? Why the heck would anyone do that?

The screen does also say "For other/previous model names, specify with --model", so I assume you can use that to get 200K Opus, but I'm very confused why Anthropic wouldn't include that in the list of options.

What a strange UX decision. I'm not personally annoyed, I just think it's bizarre.

by retrofuturism2 hours ago|

`/model opus` sets it to the original non-1M Opus... for now.

by windexh8er1 hours ago|

Thanks. I quickly burned through $100 in credit when I started using Opus 4.6 in OpenCode via OpenRouter. My session stopped with an error that didn't point to credit availability, so I was surprised a few minutes later when I finally realized Opus had just destroyed those credits on a bullshit reasoning loop it got stuck in. Anthropic seems to know that the expanded context is better for their bottom line, as they've now made it the default.

And as others have said it's very easy to burn token usage on the $100/month plan. It's getting to the point where it's going to very much make sense to do model routing when using coding tooling.

by nextaccountic41 minutes ago|

do you pay for the full context every prompt? what happened with the idea of caching the context server side?

by davesque27 minutes ago|

You don't. Most of the time (after the first prompt following a compaction or context clear) the context prefix is cached, and you pay something like 10% of the cost for cached tokens. But your total cost is still roughly the area under a line with positive slope, so it increases quadratically with context length.
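
A rough sketch of that "area under a line" argument, under simplifying assumptions that are not from any billing documentation: every turn adds the same number of new tokens, the entire earlier context is a cache hit billed at 10% of the input price, and output tokens are ignored. The function name and the $3 per Mtok price are illustrative.

```python
def session_input_cost(turns, tokens_per_turn, price_per_mtok,
                       cached_fraction=0.10):
    """Total input cost in dollars for a session whose context grows linearly.

    Each turn re-reads the whole earlier context at the cached rate and
    pays full price only for the newly added tokens.
    """
    total = 0.0
    context = 0  # tokens already in the conversation
    for _ in range(turns):
        cached_tokens = context
        fresh_tokens = tokens_per_turn
        billable = cached_tokens * cached_fraction + fresh_tokens
        total += billable * price_per_mtok / 1_000_000
        context += fresh_tokens
    return total

half = session_input_cost(50, 10_000, 3.0)    # context ends at 500k tokens
full = session_input_cost(100, 10_000, 3.0)   # context ends at 1M tokens
print(f"{half:.2f} {full:.2f} ratio={full / half:.2f}")
```

With these numbers, doubling the session length from 50 to 100 turns raises the total from about $5.18 to about $17.85, well over twice as much, because the cached term grows with the square of the number of turns even at a 90% discount.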

by aberoham3 hours ago|

    export CLAUDE_CODE_DISABLE_1M_CONTEXT=1

by teaearlgraycold3 hours ago|

Anthropic is not building good will as a consumer brand. They've got the best product right now but there's a spring charging behind me ready to launch me into OpenCode as soon as the time is right.

by kylecazar3 hours ago|

Would you use Opus if you switched to OpenCode?

by teaearlgraycold3 hours ago|

I'd like to use Opus with OpenCode right now to combine the best TUI agent app with the best LLM. But my understanding is Anthropic will nuke me from orbit if I try that.

by joecot2 hours ago|

You can use Opus with OpenCode anytime you want, just not with the Claude plan. You can use it via API with any provider, including Anthropic's API. You can use it with Github Copilot's plan. The only thing you can't do without getting banned is use OpenCode with one of Claude's plans.

by corford3 hours ago|

OpenCode with a Copilot Business sub and Opus 4.6 as the model works well

by teaearlgraycold26 minutes ago|

I'm looking at their plans (https://github.com/features/copilot/plans) and it seems like the limits might be pretty low, even with the Pro+ plan, which is 2x the cost of Claude Pro. It seems like Claude Pro might be 10-20x the Opus tokens for only twice the price.

by zhangchen39 minutes ago|

[dead]

by no1youknowz4 hours ago|

I've been jumping from Claude -> Gemini -> GPT Codex. Both Claude and Gemini really reduced quotas and so I cancelled. Only subbed GPT for the special 2x quota in March and now my allocation is done as well.

I decided to give opencode go a try today. It's $5 for the first month. Didn't get much success with Kimi K2, overly chatty, built too complex solutions - burned 40% of my allocation and nothing worked. ¯\_(ツ)_/¯.

But Minimax m2.7. Wow, it feels just like Claude Opus 4.6. Really has serious chops in Rust.

Tomorrow/Wednesday will try a month of their $40 plan and see how it goes.

by victorbjorklund3 hours ago|

Minimax 2.7 is great. Not close to Claude but good enough for a lot of coding tasks.

by girvo55 minutes ago|

GLM-5 (and 5.1) is surprisingly impressive too I’m finding.

by HDBaseT3 hours ago|

[dead]

by xantronix44 minutes ago|

This is a very normal thing to be the top comment on an article on how to use Claude Code.

by lkbm2 hours ago|

I've heard this a few times lately, but this past weekend I built a website for a friend's birthday, and it took me several hours and many queries to get through my regular paid plan. I just use default settings (Sonnet 4.6, medium effort, thinking on).

I'm guessing Opus eats up usage much, much faster. I don't know what's going on, since a lot of people are hitting limits and I don't seem to be.

by notatoad2 hours ago|

what they changed was peak vs off-peak usage metering.

using it on the weekend gets you more use than during weekdays 9-5 in US eastern time.

by matheusmoreira1 hours ago|

I waited until off peak hours to use Opus 4.6 to do some research. One prompt consumed 100% of my 5h limit and 15% of my weekly usage. Even off peak it's still insane. Opus didn't even manage to finish what it was doing.

by hrimfaxi2 hours ago|

I'm surprised it's during east coast working hours and not west coast.

by notatoad2 hours ago|

the speculation i read was that it's trading hours, and they're getting a lot of load from the finance industry

by lkbm2 hours ago|

Technically, this was Friday morning, so I think I was still in peak hours.

by teaearlgraycold1 hours ago|

Even with Opus I don’t usually hit limits on the standard plan. But I am not doing professional work at the moment and I actually alternate between using the LLM and reading/writing code the old fashioned way. I can see how you’d blow through the quota quickly if you try to use LLMs as universal problem solvers.

by outside12342 hours ago|

They need to get to profitability because that sweet sweet Saudi subsidy cash is gone gone.

by kderbyma1 hours ago|

They won't be profitable at this point... they just don't realise they are eating their own tail.

by manmal4 hours ago|

Looks like they are falling victim to their own slop. This smells a lot like the Amazon outages caused by mandated clanker usage.

by maximinus_thrax1 hours ago|

I'm very surprised to see enshittification starting so early. I was expecting at least 3-4 years of VC-subsidized gravy train.

by kderbyma1 hours ago|

This has been 6 months of constant decline, so at this point I am wondering when they cliff it like WeWork.

by skwallace363 hours ago|

things are rough out there right now

by irishcoffee2 hours ago|

Reminds me of when I would mess with my friends on "pay per text" plans by sending them 10 text messages instead of just 1. I should start paying attention to unattended laptops and blow up some token usage in the same manner.

It's almost like an evolution of bobby tables.

by LeonTing101016 minutes ago|

[dead]

by alcor-z29 minutes ago|

[dead]