undefined

upvote

points

by my00221 hours ago |

upvote

by giwook20 hours ago|

[-]

Lots of us have noticed that usage limits for Claude have been nerfed in recent weeks/months.

If anything, these new multipliers are more transparent than anything OpenAI or Anthropic have communicated regarding actual costs and give us a more realistic understanding of what it's costing these providers.

The fact that we were able to get such a substantial amount of usage for $20/$100/$200 a month was never meant to last and to think otherwise was perhaps a bit naive.

This feels like a strategy from the ZIRP era of tech growth where companies burned investor capital and gave away their products and services for free (or subsidized them heavily) in order to prioritize user acquisition initially. Then once they'd gained enough traction and stickiness they'd then implement a monetization strategy to capitalize on said user base.

reply

upvote

by dualvariable20 hours ago|

[-]

However, inference costs for entirely good enough models are likely to keep declining in the future. We're probably hitting diminishing returns on model size and training. The new generations aren't quantum leaps anymore, and newer generations of open source models like DeepSeek are likely to start getting good enough.

There's going to be a limit to how much they can raise prices, because someone can always build out a datacenter and fill it up with open source DeepSeek inference and undercut your prices by 10x while still making a very good ROI--and that's a business model right there. Right now I'm sure there's a lot of people who will protest that they couldn't do their jobs with lesser models, but as time goes on that will get less and less. Already right now the consumers who are using AI for writing presentations, cooking recipe generation and ELI5 answers for common things, aren't going to be missing much from a lesser model. That'll actually only start to get cheaper over time.

Also for business needs, as AI inference costs escalate there comes a point where businesses rediscover human intelligence again, and start hiring/training people to do more work to use lesser models--if that is more productive in the end than shelling out large amounts of cash for inference on the latest models. [Although given how much companies waste on AWS, there's a lot of tolerance for overspending in corporations...]

reply

upvote

by geodel19 hours ago|

[-]

> because someone can always build out a datacenter and fill it up with open source DeepSeek inference and undercut your prices by 10x while still making a very good ROI-

Not sure how it all works out. Currently trillion dollar companies can't make a native app for platforms. Everything is just JS/Electron because economics does not work for them.

And here companies can make GW data center running very expensive GPUs for 1/10th of current prices. Sound little fanciful to me.

reply

upvote

by bootsmann18 hours ago|

[-]

The price you pay for anthropic must include the price of training new and better models which is incredibly costly. If you use the models someone else already spend money to develop you don’t need to pay this price.

reply

upvote

by croes19 hours ago|

[-]

I guess the new models will still be quantum leaps, but literally: "The smallest possible change in a system"

reply

upvote

by hansmayer3 hours ago|

[-]

They've been like that for a while actually, I think at least since the big hype around ChatGPT 4.5 (or was it 5?) and that underwhelming, lukewarm, oversanitised presentation by Altman and his team.

reply

upvote

by ctoth18 hours ago|

[-]

Yups... Mythos is the smallest possible leap. Not a standard model generation advance, not even a version point advance. Just the smallest possible quanta of a change. We are absolutely hitting a plateau any day now. Any day. Any time. Any second now. Yup. Right now! Surely!

reply

upvote

by hansmayer3 hours ago|

[-]

I mean let's be realistic - all that we know about the "mythical" Mythos is the carefully curated and release stuff by the Anthropic's PR team. Is it really a huge leap they are making it to be? I doubt it. In fact I bet if it was indeed that powerful and dangerous, as they imply, they'd find a way to release it immediately, devastate OpenAI and DeepSeek and secure a leading position in the market. Why is it not happening? I suspect because Dario is again at it, peddling his bullshit.

reply

upvote

by cubefox18 hours ago|

[-]

Yeah. AI progress is insanely fast if you compare it to anything else. Where else is a one year old technology already hopelessly outdated? 10 years ago is basically stone age.

reply

upvote

by madamelic17 hours ago|

[-]

I am continually tripped out by the fact when I was 16, I didn't have a 'smartphone' beyond a Windows Mobile 6 phone that had no internet on it.

Now, I have this high-resolution shiny object that can near instantaneously get any information I want along with _streaming HD video to it_ *anywhere*.

15 years even feels like a stone age. I can't fathom what it has to feel like people in their 60s and 70s.

reply

upvote

by nonameiguess17 hours ago|

[-]

I'm not quite 60, but it's always interesting to me that I feel quite the opposite of this. When I was 16, I didn't have a computer, didn't have a phone, had never used the Internet, but when I think of how life has changed, it's frankly not much. I woke up this morning, scooped my cats' litter boxes, took out some trash, made myself breakfast, ate that, read some news while eating, then lifted weights in my garage, had some work meetings, wrote up some instructions per a customer request from Friday, and am about to go drive to the lake to go do a 9 mile longboard loop.

That's very close to a normal day in 1996. The biggest difference is I read the news on my phone instead of a physical newspaper. The news was not any more interesting or informative because of that. I guess I can also still do the loop reasonably well, but I'm a lot slower than I was in 1996 when I was a cross-country state champion.

My parents are closing in on 70 and I guess I can't speak for them, but I'm at least aware of the daily routines of their lives, too. Walk the dog, do housework, DIY building projects, visit kids and grankids. Seems much the same, too, with the biggest difference being they're now teaching my sister's sons to play baseball rather than me, but shit, one of her sons even looks like exactly the same way I looked when I was 7! The more things change, the more they stay the same.

reply

upvote

by jjkaczor13 hours ago|

[-]

General agree... I still do the things (mid-50's) I used to do when I was a teenager with no computer, no phone.

But - now they are easier - I can read books on an e-ink screen and pretty much instantly find what I want to read next. I get my news on a phone. I used to watch TV/movies broadcast or on tape rentals. Now, I have just about everything I could ever want available - without ADs... those were such a time-waster.

What has changed is that I have access to MORE information than my local (or school) libraries could ever provide - in a variety of more accessible formats. Whatever tools I need to get "work done", I can find a myriad of free and open-source options.

But - the overall days and household family routines are the same - now, instead of reading a paper book while waiting to pickup my kids (or other family members) "back-in-the-day", I can read my device, or connect with my DIY communities online on my phone - or learn something new. I don't have to schedule life around major broadcast events, I can easily do many tasks while I am "out-and-about".

Friction has been reduced.

reply

upvote

by rootusrootus16 hours ago|

[-]

If your parents are closing in on 70, I would have expected you to be closer to not quite 50 than not quite 60.

I am just over 50 myself and I agree with your points. Technology has changed but life is largely very similar to wear it was in the 90s. At least day to day. Attitudes are way worse now.

reply

upvote

by madamelic15 hours ago|

[-]

Thank you for this insight!

I always wonder the views of older people. My parents are very technology forward and have been my entire life so it is difficult to gauge how different life is compared to when they were growing up.

It's easy to hear "Oh well I only had 640kb of memory and typed programs out of a magazine I got in the mail!" and see as distinct from having 'unlimited' resources and the internet.

Your insight is good ("The biggest difference is I read the news on my phone instead of a physical newspaper") that life sort of stays the same but the modality changes. People still go to the store like they did in the mid-1800s but now it is by car.

I wonder what our "industrial revolution" will be where the previous generation lived (ie: out in the country on a farm) totally different lives to the current (ie: in the city in a factory). Maybe when space travel and multi-planetary living is normalized?

reply

upvote

by saulpw14 hours ago|

[-]

> It's easy to hear "Oh well I only had 640kb of memory and typed programs out of a magazine I got in the mail!"

Since I was there (young, but there), I want to point out that this crosses three eras which all felt quite different:

    1978: typed programs in from a magazine or loaded from a cassette (16kB, TRS-80)
    1983: loaded programs from a floppy (64kB, Apple ][ and C64 etc)
    1988: loaded programs from a hard disk (640kB, IBM PC and Mac).

Exact years vary but these eras were only about 5 years each. Nobody had a floppy in 1978 but almost computer user did by 1983; nobody had a hard drive in 1983 but almost everyone did by 1988.

reply

upvote

by bobthepanda14 hours ago|

[-]

To some degree this already happened with the move from the industrial city to suburbanization and then re-urbanization. In particular one of the most notable recent developments is that urban waterways are now pretty desirable places to be with parks and recreation; in most industrializing cities the waterfront was actively avoided because the industrial use made it polluted, smelly etc.

reply

upvote

by zdragnar17 hours ago|

[-]

Depends on where you live. My dad is almost 80, grew up in a very rural area, and when he was 16 they'd just gotten indoor plumbing. Up until he was 14, his school was a one-room school house with no heating other than a wood stove. If you were the first kid to arrive for the day, it was your job to get the fire going in winter months.

Housework meant no laundry machine, no dishwasher, and possibly no vacuum cleaner. That means hand washing everything, and beating rugs with sticks and brushes to get the dust off of them.

reply

upvote

by leoedin1 hours ago|

[-]

The early lives of my grandparents (in their 90s) are so fascinatingly different to that of mine. But even by the time my parents were growing up in the 60s, life was not so different in the west. The real differentiators in living standards - energy, household appliances and cooking, modes of transport - were more or less figured out then. By the time my parents were young adults in the early 80s, so many of the aspects of "modern life" had been figured out.

I look at the life my kids live, and it's not so different to my childhood. The toys are similar, their housing is similar. Probably the biggest difference is the availability of content on demand rather than much more fixed TV schedules.

The big difference in the last 30 years hasn't so much been in the kind of middle class life you can live, but the number of people who live that kind of life. In the 90s 40% of people globally were living in extreme poverty. Now its under 10%. The kinds of lives the middle class live in China and Vietnam are closer to those of Europeans today, when even 30 years ago most people in those countries were living much closer to the way your dad grew up.

I wonder if AI will result in a step change of living standards? Perhaps along with robotics we'll finally get to do nothing at all at home? I'm not convinced it'll be quick though. Maybe another 30 years.

reply

upvote

by gzread12 hours ago|

[-]

The news on the phone is worse, in fact.

reply

upvote

by giwook19 hours ago|

[-]

I think so too.

And at some point even frontier model costs will hopefully come down (if there is still a meaningful difference between closed and open source models at that point) as all of the compute that's being built out right now comes online.

reply

upvote

by Fire-Dragon-DoL20 hours ago|

[-]

I hope it's true, but right now hardware prices are insane

reply

upvote

by hirako200019 hours ago|

[-]

It does feel like the music is about to stop.

It has been years now, of cash injections, investors can't keep feeding the beast forever.

reply

upvote

by Gigachad15 hours ago|

[-]

This is the best AI programming will be. From here on the enshitification starts and the prices go up.

reply

upvote

by ochronus8 hours ago|

[-]

As predicted by many. The math is, as usual, mathing.

reply

upvote

by ctoth18 hours ago|

[-]

It has been years now of reading this same comment... Surely people can't keep typing it forever.

reply

upvote

by stuartq17 hours ago|

[-]

But the prices haven't been going up by multiples of 6 for the past few years. Things are actually changing now. I don't think it's over, but in the short term, it's going to be considerably more expensive.

reply

upvote

by hirako200015 hours ago|

[-]

They will smooth up the spike. Or be subtle and transform the existing quota so that they run out more quickly. Calling it caching, compression, optimisation, of course for the sacred benefit of the users.

That would be, even is, the smart thing to do.

reply

upvote

by Applejinx3 hours ago|

[-]

And it didn't really get flawless, did it? All the same objections stand, but the cost is inevitably blowing up for the same kinda jank product.

reply

upvote

by soraisdead12 hours ago|

[-]

The difference is we're now in a world where Disney has pulled out of OpenAI without comming, and Sora was dropped off a ditch.

In other words.

The bubble has burst. You're just in denial.

reply

upvote

by hirako200055 minutes ago|

[-]

My read is that the bubble as burst internally (angels, seeds, VCs, and even corporate got a grasp of the inflated promise). It will take while for the actual bubble to implode.

reply

upvote

by tclancy13 hours ago|

[-]

I’m not willing too, but I can set up a cron job to Claude -p the task.

reply

upvote

by stefan_16 hours ago|

[-]

Dunno, if in this day and age you are making inference more expensive, more scarce, you are honestly moving in the wrong direction and DeepSeek and others will gladly take your lunch.

reply

upvote

by Gigachad15 hours ago|

[-]

The hardware to run deepseek is still incredibly expensive.

reply

upvote

by cheema3314 hours ago|

[-]

> The hardware to run deepseek is still incredibly expensive.

Deepseek API pricing is very low compared to Anthropic/OpenAI API pricing.

For many, the 300% difference in pricing may be difficult to justify, if the quality difference is very small. And there will be many tasks where the most expensive/the best model, is not needed. Currently many people end up using Opus 4.7/GPT 5.5 for many tasks without thinking about it.

reply

upvote

by Gigachad13 hours ago|

[-]

Is deepseek still on subsidized pricing though.

reply

upvote

by 2ndorderthought3 hours ago|

[-]

Near zero probability of that. The model is more efficient and the company who trained it did not blunder trillions of dollars to do so. China has better electricity infrastructure than the US too, so the likelihood they can scale out before the US ever could is high. Long term deepseek, Alibaba, etc hold the most cards for sustainable AI even despite the attempted Nvidia embargo

I am not shilling China, this is just what is happening right now.

reply

upvote

by ConSeannery2 hours ago|

[-]

Lol what? You seriously think that the #1 Chinese AI company is not being subsidized by the Chinese government?

reply

upvote

by 2ndorderthought2 hours ago|

[-]

Which one is #1? Alibaba, deepseek, or moonshot?

I think the Chinese government works differently than the US government. I think China has been subsidizing their electricity grid for decades and leading the world on sustainable electricity namely solar. While the us has let their infrastructure rot and laughed at government inefficiencies for about half that time. The US has data centers running on gas right now while waging wars blowing up gas infrastructure world wide. It would be comical if it wasn't an environmental disaster. Most of them have no hopes at even getting enough power in well established areas short term.

I realize what I am saying may come off as propaganda because the US holds net negative views on China so here are some links.

https://www.technologyreview.com/2025/07/10/1119941/china-en...

https://www.wired.com/story/data-centers-are-driving-a-us-ga...

I think because openai spent so much money upfront showing how it was possible to do this and laid out a product roadmap China got to get on board much cheaper and easier. I see no reason to not believe any of these companies when they say they didn't squander tons of money to do what they did because I don't know how openai has even spent all the money they have it's actually ridiculous to think about.

https://the-decoder.com/openai-adds-111-billion-to-its-cash-...

reply

upvote

by nl11 hours ago|

[-]

Judging by the multiple providers selling it for around the same price (including non-VC funded competitors): no, it isn't subsidized.

reply

upvote

by johndough5 hours ago|

[-]

Is there somewhere I can look up whether a certain provider is VC-funded?

reply

upvote

by 2ndorderthought3 hours ago|

[-]

It's not really about that. China is eating the US's lunch when it comes to ai. Don't get me wrong opus is the strongest model out there today, but that's the us's only advantage right now. Deepseek,qwen,kimi, etc all have fundamental research making the models smaller, more efficient, scalable, etc. in the US the plan is to buy all the hardware, write legislature, embargo other countries, keep models and research closed, so people cannot innovate for the next two to five years.

Unlike the us chinas focus is on research and sustainable building. China also has really good infrastructure for energy, etc. it is also to their advantage to drop 5 billion instead of 2 trillion and beat the us while turning a profit.

Chinas focus in ai is less flashy and because they are the biggest manufacturing super power in the world right now, it directly feeds their economy. They aren't looking for applications or to replace thought workers with slop bots, they have natural needs for this technology. Us manufacturers can't compete so they have to keep companies from selling their goods there see byd. China sees it as commoditizing their complement, the us is risking its entire economy and it's environment and resources, kind of scary.

reply

upvote

by 2ndorderthought3 hours ago|

[-]

Have you seen the news about qwen3.6? People are running it on sub 1000 euro hardware. Apparently it's about as good as Claude sonnet.

reply

upvote

by sergiotapia13 hours ago|

[-]

That is folly because there is very minimal cost to switching providers, let alone models.

reply

upvote

by glicvdfhsdf15 hours ago|

[-]

[dead]

reply

upvote

by bluescrn19 hours ago|

[-]

Did anyone really expect AI to be cheap?

If/when it gets to the point where it can replace a skilled worker, the service can be sold for close to the same price as that skilled labour. But the AI can run 24/7, reliably, and scale up/down at a moments notice.

There's not going to be much competition to drive prices down, the barriers to entry are already huge. There'll likely to be one clear winner, becoming a near-monopoly, or maybe we'll get a duopoly at best.

reply

upvote

by hansmayer17 hours ago|

[-]

> Did anyone really expect AI to be cheap?

Yes, a lot of people (not me). Why? Well because that was the whole value proposition of these companies, relentlessly pushed by their PR and most of the media- rememmber it was something something Pocket PhDs, massive unemployment etc?

reply

upvote

by rwyinuse18 hours ago|

[-]

"There's not going to be much competition to drive prices down, the barriers to entry are already huge. There'll likely to be one clear winner, becoming a near-monopoly, or maybe we'll get a duopoly at best."

Based on what exactly? So far every time OpenAI, Anthropic or whatever has released a new top performing model, competitors have caught up quickly. Open source models have greatly improved as well.

I expect AI to be just like cloud computing in general - AWS, Azure, GCP being the main providers, with dozens of smaller competitors offering similar services as well.

reply

upvote

by 2ndorderthought3 hours ago|

[-]

Right now China is flexing the future in my opinion. Smaller, widely available, frontier models for pennies on the dollar.

I think the future of ai will be breakthroughs that let it run on commodity hardware, and the average person will not be paying for it from the cloud unless they want to be surveilled or are stuck on older hardware.

Right now I am running about what was a frontier model 1-2 years ago on a junk machine. Some people are running what was a frontier model 4 months ago on PCs and laptops that cost 5,000. In a year I think the landscape will be even better.

reply

upvote

by flir18 hours ago|

[-]

I do. "Commoditize your complement". Want to sell lots of silicon? Give away good local models to run on that silicon.

Even if SOTA models in the cloud are a few percentage points better, most work can be routed to local models most of the time. That leaves the cloud providers fighting over the most computationally intensive tasks. In the long term, I think models are going to be local-first.

(Unless providers can figure out a network effect that local models can't replicate).

reply

upvote

by bluescrn5 hours ago|

[-]

> I think models are going to be local-first.

Why on earth would that happen when everything else is moving into the cloud to tie it to ever-escalating subscription fees and prevent piracy?

Even with gaming, where running high-end 3D games in the cloud seems like madness and inevitably degrades the quality of the experience, they won't stop trying.

reply

upvote

by vanviegen17 hours ago|

[-]

> In the long term, I think models are going to be local-first.

Why? There's an inherent efficiency advantage to scale, while the only real advantage for local models (privacy/secrecy) hasn't proven convincing for broader IT either.

reply

upvote

by 2ndorderthought2 hours ago|

[-]

It's foolish not to care about privacy especially as a company. You know how it prevents you from emailing yourself your tax documents? Meanwhile thousands of employees are sending literal design docs, software, product goals, etc to several ai third partys. Not only is that insane, the companies they are sending it too intend too and openly admit to scanning the data, make software products themselves, and intend to create models that can produce their products automatically.

The reason local models hasn't caught on is several fold. It's marketing to say your company follows the latest trend, and there's an inherent pressure to keep AI companies afloat so the economy doesn't entirely collapse. The other is, it wasn't until the last month that these models have caught up to frontier models. They just did, and they are more efficient and don't require a team of 500 to deploy.

reply

upvote

by solid_fuel15 hours ago|

[-]

Local first models aren't just more private than the API vendors, they also have the advantages of fixed cost, lower latency, and better stability - local models don't get nerfed/"updated" in the background like chatgpt does.

Maybe in a world where these AI companies behaved with some semblance of ethics and user-friendliness they would be on even ground, but for anyone paying attention local models are obviously the future.

reply

upvote

by still_grokking11 hours ago|

[-]

> the only real advantage for local models (privacy/secrecy) hasn't proven convincing for broader IT either

Because of nonexistent regulation. Just wait for it…

The legal situation in for example the EU is crystal clear, only that it will take some time to go though all court instances.

reply

upvote

by LtWorf15 hours ago|

[-]

To not depend on an external company that can decide the price.

reply

upvote

by Dylan1680712 hours ago|

[-]

That's a silly reason. For non-agent use cases what kind of utilization are you going to average on your own GPU, 5-10%? And that's without batching.

Even with overhead and scaling for peak use and a large profit margin, any company with an ounce of competition will be vastly cheaper than self-hosting. And for models you can run yourself, there will be plenty of competition.

reply

upvote

by LtWorf8 hours ago|

[-]

I think you are calculating with current prices. Try to extrapolate the price in one year, seeing the current trends instead.

reply

upvote

by vanviegen7 hours ago|

[-]

Extrapolating current trends, I expect API prices to drop significantly for a given measure of 'intelligence'.

reply

upvote

by soraisdead12 hours ago|

[-]

> Did anyone really expect AI to be cheap?

Considering most of the cost of producing a model is the upfront cost rather than the running one, I kinda still do.

The point was never to produce 4 frontier models per company a year.

reply

upvote

by skeeter202020 hours ago|

[-]

"This change aligns Copilot pricing with actual usage and is an important step toward a sustainable, reliable Copilot business and experience for all users."

I see statements like this as strong indicators that the sales people are wrapping up their work and the accountants are taking over. The land rush is switching to an operational efficiency play.

reply

upvote

by torben-friis19 hours ago|

[-]

The sooner the better. Let's take a look at the long term, enshittified, viable product before we get too dependent on the trial version.

reply

upvote

by fsniper19 hours ago|

[-]

And enshitification starts.

reply

upvote

by specproc21 hours ago|

[-]

Yeah, totally. The recent pricing changes have just made my Copilot subscription go from great deal to awful value over night.

I've been wanting to get off MS more generally and this is good motivation. Will be playing round with OR this week.

reply

upvote

by cedws20 hours ago|

[-]

Just be aware OpenRouter charges a 5.5% fee, I didn’t know until recently. I like the product, and I think the fee is fair, but if you want the absolute best pricing then go direct.

reply

upvote

by ffsm819 hours ago|

[-]

But with open router you can always just use the latest model. If you're committed to eg Claude opus then you're better off going directly to anthropic for sure, but if not, varying other models may be fine too, depending on use case and be massively cheaper. Eg new deep seek model with same mio context window or Kimi k2.6 with 270k context window for subagents which implement

reply

upvote

by gruez18 hours ago|

[-]

>but if not, varying other models may be fine too, depending on use case and be massively cheaper

Do inference providers have standardized endpoints, or at least endpoints compatible with claude code? Otherwise to pay 5.5% on all your tokens just so it's slightly easier to swap providers (ie. changing a few urls?)

reply

upvote

by swiftcoder18 hours ago|

[-]

> Do inference providers have standardized endpoints, or at least endpoints compatible with claude code?

Yep, you can plug deepseek/kimi/minimax into claude code just fine. Or run everything through another harness like opencode instead.

reply

upvote

by attentive12 hours ago|

[-]

Or you could use gcp Vertex or aws Bedrock and still have access to a bunch of FMs without a markup.

reply

upvote

by AntiUSAbah19 hours ago|

[-]

Wow thats a lot for routing traffic.

reply

upvote

by sailfast19 hours ago|

[-]

And handling API tokens, and billing, and reliability, and middleware. I am not affiliated with them but it’s not “just” routing.

Apple still charges 30%. 5.5 seems pretty reasonable. /shrug I dunno.

reply

upvote

by Dylan1680713 hours ago|

[-]

> handling API tokens

Don't you still need to handle tokens with them? Also that's trivial.

> billing

Yes but you'd be paying for billing anyway.

> reliability

They increase reliability?

> middleware

Which you wouldn't need if you paid directly.

I'm not saying they shouldn't get 5.5%, but that list is mostly non-convincing.

> Apple still charges 30%.

3 of the 30 is for billing, with the rest mostly being gatekeeping with a fake justification on top.

reply

upvote

by polski-g9 hours ago|

[-]

There's nothing trivial about getting a Google API key. Openrouter removes that stress from my life. And I can route requests to providers above a certain TPS threshold. And much more.

reply

upvote

by ac2919 hours ago|

[-]

Payment processing likely eats up at least 2-3% of that

reply

upvote

by arcanemachiner16 hours ago|

[-]

IIRC OpenRouter charges you for the payment processing fee also.

Still worth it IMO to be able to switch from Provider A to Provider B if Provider A is having a bad day.

reply

upvote

by webworker12 hours ago|

[-]

I will not be renewing/switching over, either.

I had copilot mainly so I could write issues and throw agents at it, while I went off and did other things. Has been great for contained spot work.

At this point, I'll go ahead and leave it expire, and then consolidate between Codex and JetBrains AI. Especially since Xcode supports Codex with a first-party integration.

reply

upvote

by nacs21 hours ago|

[-]

Even Sonnet 4.6 is 9x multiplier (previously 1x)!

The only model I even used on Copilot was Sonnet and now its got a ridiculous multiplier.

At this point they might as well just charge per Million tokens like every other provider instead of having a subscription.

reply

upvote

by krzyk6 hours ago|

[-]

They do for any new plan. Those multipliers are only for people that paid annually. After their subscription ends they'll go into token based pricing like the rest of people.

reply

upvote

by mitjam16 hours ago|

[-]

I understand it like : the 10 usd is for handling the business record, maybe also the harness, I get a few coins to kick tires, but to use it for anything real it’s pay as you go by the tokens list price.

reply

upvote

by altmanaltman20 hours ago|

[-]

> At this point they might as well just charge per Million tokens like every other provider instead of having a subscription.

Pretty sure that's what they will eventually do

reply

upvote

by tjoff18 hours ago|

[-]

... that is exactly what they will do. Just click the link in this thread, or read the headline.

reply

upvote

by hrpnk17 hours ago|

[-]

Why the multipliers then at all?

reply

upvote

by lexone17 hours ago|

[-]

The multipliers are there only for current annual plan customers. After 2026 its all tokens.

reply

upvote

by MattBDev15 hours ago|

[-]

I thought I was smart for buying the annual plan after I graduated and lost my student plan and then GitHub taking away my Copilot Pro I got for free for being a author of a popular OSS project. Turns out I'm being punished for making that year commitment to them. I like to think I'm only a moderate user of GHCP so this is just terrible for me. I'm honestly thinking about cancelling and switching to alternatives while also looking at investing in a local LLM setup.

reply

upvote

by fuglede_8 hours ago|

[-]

So they're changing the product that people already paid an annual subscription for to the worse. That's asking for legal complaints.

reply

upvote

by ItsClo68821 hours ago|

[-]

27x for Opus is genuinely shocking. at that point you're not paying for convenience anymore, you're just paying a GitHub tax. OpenRouter or direct API makes way more sense unless you're really glued to the IDE integration.

reply

upvote

by thrdbndndn21 hours ago|

[-]

I keep seeing people mention OpenRouter.

Does it effectively bypass regional restrictions for you, so you can use something like the Claude API from unsupported regions such as Hong Kong, or does it still enforce the official providers' geo-restrictions?

reply

upvote

by rvnx21 hours ago|

[-]

OpenRouter is great for budget control, but as they are indirect APIs, your experience with cached tokens may vary, eventually costing much more than in direct depending on the providers.

You can pay with crypto though, which seems to be convenient for people under sanctions or with limited access, or if you are in low-tax jurisdiction (e.g. HK)

reply

upvote

by jauntywundrkind20 hours ago|

[-]

Caching is advertised per model+provider.

That said I think few people using openrouter are actually being selective about providers.

It took half a day to get my opencode setup, was not friendly. A lot of manually cross referencing model and providers. I was actually mainly optimizing for relatively fast providers. It all is super fragile and I'm sure half out of date; I have no idea if these picks are still fast, no promises they are still the same price (pretty terrifying honestly).

I'm mostly on coding plans so it doesn't super affect me. But man is it a bother to maintain.

reply

upvote

by johndough18 hours ago|

[-]

It's interesting that the cost multiplier for Claude Sonnet 4/4.5/4.6 varies so much (1/6/9), while the API cost is exactly the same for all three models.

Also, the multiplier of 27 for Claude Opus 4.6/4. is way higher than the increase in API price would suggest.

I wonder why that is.

reply

upvote

by vanviegen17 hours ago|

[-]

On GitHub copilot you pay per prompt. More powerful models can do a lot more work (consuming a lot more tokens) per prompt. Also, they tend to use more thinking tokens.

reply

upvote

by johndough15 hours ago|

[-]

> More powerful models can do a lot more work (consuming a lot more tokens) per prompt.

That is not my experience. Each model since at least GPT-4 can fill up an entire context window. In fact, more powerful models can solve tasks faster, so their ratio of multiplier to API price should decrease, not increase.

For example, Claude Sonnet 4.6 has a multiplier of 9 and an API price of $15, which is 0.6 multiplier per dollar.

Claude Opus 4.7 has an API price of $25, so it should have a multiplier of 25 * 0.6 = 15 when extrapolating from Sonnet, but the multiplier is 27.

> Also, they tend to use more thinking tokens.

That might be it. Is there any data on this somewhere?

reply

upvote

by mullingitover14 hours ago|

[-]

The point of this loss leading is to properly hoover up the money in the pockets of enterprise customers, get them locked into the idea that they need the latest and greatest cloud-based model, while simultaneously starving everyone of the memory they'd need in order to run competent models locally.

In not-too-distant future we're going to be running better models on our phones than we can buy access to today in the cloud. Skate where the puck is going: soak the customers until that day comes.

reply

upvote

by rvnx20 hours ago|

[-]

One theory of the play of SpaceX might do if everyone migrates to query-based billing:

Provide cheap and unlimited access to Grok for programmers (hence the Cursor partnership/purchase for distribution).

-> This would drag massive revenue right before the IPO announcement, like if the company is super growing

-> At a loss, but don't worry, we need these funds to build the biggest datacenter of the universe.

This announcement would create enough momentum to increase valuation, and because of the merge of his companies, would save his X/Twitter investors from a tragedy.

-> Would also be a great service to Cursor investors and so, who are stuck with their VSCode fork

reply

upvote

by AntiUSAbah19 hours ago|

[-]

They probably want the training data. Otherwise these 60B don't make sense at all.

But they can't buy curser before their IPO so thats that?

Perhaps they have to much compute because Musk overpromised and Twittergroq doesn't need that much compute after he nerved the porn stuff?

reply

upvote

by minimaxir20 hours ago|

[-]

It takes longer to build a datacenter with that much capacity than it does for the market to respond.

reply

upvote

by 2ndorderthought2 hours ago|

[-]

Buying real estate in imaginary places is lucrative at first

reply

upvote

by hgoel15 hours ago|

[-]

I think they're going to have to do a lot to overcome the Musk and Grok poison. Even ChatGPT hasn't had as many lapses as Grok has had.

reply

upvote

by 12 hours ago|

[-]

deleted

reply

upvote

by gigiogigione20 hours ago|

[-]

I don’t get the SpaceX reference. I thought they made rockets?

reply

upvote

by victorbjorklund19 hours ago|

[-]

Nobody is paying for Elons xAI so he used SpaceX to buy xAI to fund it.

reply

upvote

by sethops113 hours ago|

[-]

Under the pretense that SpaceX will be used to launch material into space to build space data centers.

reply

upvote

by vizzier19 hours ago|

[-]

They now also own xAI

reply

upvote

by 0xffff218 hours ago|

[-]

Which in turn owns Twitter. SpaceX is now a social media company in addition to a rocket company.

One theory I think Matt Levine posited, is that SpaceX will go public with dual-class stock that gives Elon control even with a minority ownership stake, and will subsequently buy Tesla, which doesn't have dual class stock, making SpaceX the singular "Elon Musk company", with him having operational control despite being public.

reply

upvote

by lioeters16 hours ago|

[-]

That theory aligns with Elon's long-held dream of X as the "Everything Company".

reply

upvote

by Ballas6 hours ago|

[-]

Then he'll rebrand SpaceX to " X" (a space followed by an X).

reply

upvote

by 20 hours ago|

[-]

deleted

reply

upvote

by krzyk6 hours ago|

[-]

Those multiplier are only for grandfathered Pro an Pro+ plans that had annual billing, basically a way to scare people of out of those plans. Ant new ones (and bussiness+enterprise plans) will be on token based billing since June 1.

reply

upvote

by sandos3 hours ago|

[-]

Wow, having a corp. account I do wonder WHEN we are getting some kind of resctriction of usage, or require us to justify our usage.

That GPT4-mini change is going to be brutal! Its much better than 5-mini, which was itself much better than earlier free models.

reply

upvote

by joelthelion7 hours ago|

[-]

Can't wait for people to migrate to open tools (opencode/openrouter). This will unlock a lot of innovation.

(I know openrouter is not open, but it allows competition and should be easily replaceable if needed)

reply

upvote

by youwangd4 hours ago|

[-]

Show HN timing matters more than people think. Monday-Thursday, 9-11am Pacific, is when the front page has the most engaged readers. Weekend posts get less competition but also less engagement.

reply

upvote

by mkhalil17 hours ago|

[-]

Why would folks be better paying 5.5% fee to OpenRouter ("Open") if most people just use one or two providers? Just use the provider's API.

reply

upvote

by djeastm13 hours ago|

[-]

The routing automatically routes you to other inference providers (for the same model) if/when the original provider goes down.

It's a convenience cost, for sure, but it's not valueless in a fast-moving world. Certainly if you're comfortable with one provider and it's cheaper, do that.

reply

upvote

by sally_glance7 hours ago|

[-]

For me the largest value-add is the unified API. Being able to instantly start trialling a new model with zero code changes is well worth 5%. The other part is not having to deal with billing for multiple platforms.

reply

upvote

by minimaxir21 hours ago|

[-]

What's annoying is that it's obvious. In the case of GPT 5.5, if Copilot is going to charge 7.5x what GPT 5.4 costs while OpenAI themselves via the API/Codex only charges 2x of what GPT 5.4 costs, that will immediately raise an eyebrow.

reply

upvote

by boothby20 hours ago|

[-]

To anybody who's been watching the tech sector with a critical eye for pretty much any period from the late 90s and onward, this is just the enshittification process. For most of OpenAI's existence it's been obvious, to me, that investors were burning insane levels of capital to build the market, and now that folks are locked in, you're seeing higher fees, ads, etc. Yet again, the user is the product; the investors want to siphon your data, attention and once you're hooked, money. And for companies like Microsoft and Apple, those hooks can dig deep.

reply

upvote

by Ekaros7 hours ago|

[-]

Let's call it for what it is dumping. Dumping things on market below cost of production. This should not have ever been allowed. RnD costs I can accept somehow. But in this case the interference should have always been billed for the real costs that it took to produce and pay off the capex.

reply

upvote

by Incipient20 hours ago|

[-]

I'd call it a straight up "bait and switch".

reply

upvote

by BearOso19 hours ago|

[-]

If you paid attention to the power requirements and amount of hardware being put into data centers, you should have realized that it cost them an order of magnitude more than you were being charged. To rework your analogy: they hooked you, now they're gonna see if they can reel you in.

reply

upvote

by AntiUSAbah19 hours ago|

[-]

They can only reel you in if its worth it. I still can code.

And while i do not spend 200$ privat, in my startup we discussed this and our current mental model is, that instead of hiring someone new, we prefer to have more money for tokens.

This is easier for us and has a bigger benefit. The cost of a new / first employee is very high, a 200$ subscription is not. Upgrading that to lets say 400 or 800$ is still alot easier and if i can run multiply and better agents with that money, lets goooo.

reply

upvote

by boothby19 hours ago|

[-]

I'm looking at education -- teachers and students, not terribly tech savvy, are being mandated to use these tools. And then comes the rug-pull. It was worth it, but now it's outside of their budget. Poorer schools / students can't stay at the cutting edge; richer schools / students can.

reply

upvote

by AntiUSAbah16 hours ago|

[-]

You still get far with 20$ if you don't use it daily for lots of coding and thinking though.

And Gemma 4 and other open models can easily be hosted even for schools.

reply

upvote

by jochem919 hours ago|

[-]

Oh, I thought it was opium.

reply

upvote

by Gagarin191720 hours ago|

[-]

“Enshitification” is just when unsustainable subsidies end?

Another reason to hate that word.

From a different perspective, you were granted an incredible gift from the companies who let you use their product on their dime. Hopefully you made the most of it when you had the opportunity.

reply

upvote

by boothby18 hours ago|

[-]

No, it's much more than that. It starts with unsustainable subsidies, as Uber undermined the taxi industry with a ludicrous burn rate. And then, once everybody's hooked to the point that they can't imagine life without the product, you raise costs. And you iterate: raising costs, lowering quality, selling data, increasing addictiveness. Until everybody wants to get rid of it, hates every aspect of it, but is still hooked to the core product. I'm personally not using these tools, not using uber or Meta products. But I'm still using some Google products and it's hard to extricate them from my life now that I'm using them.

reply

upvote

by Gagarin191713 hours ago|

[-]

> No, it's much more than that.

Okay then this AI stuff isn’t an example of that even under your definition.

reply

upvote

by itemize12310 hours ago|

[-]

unless the 5.4 price is a huge loss leader for them

reply

upvote

by recitedropper19 hours ago|

[-]

Everyone seems to believe OpenRouter isn't subsidizing but, until they publish audited financials, I personally doubt it.

reply

upvote

by deaux18 hours ago|

[-]

OpenRouter doesn't even have hardware. What are they possibly subsidizing? The platform costs?

OpenRouter is guaranteed to be about the highest margin operator in the business right now. Everyone wishes they'd be them, skimming 5% off as the middleman without any OpEx.

reply

upvote

by ValentineC15 hours ago|

[-]

> OpenRouter is guaranteed to be about the highest margin operator in the business right now. Everyone wishes they'd be them, skimming 5% off as the middleman without any OpEx.

The 5% fee probably has to factor in Stripe's fees, which would be around 3% to 4% depending on whether it's an international card.

reply

upvote

by recitedropper18 hours ago|

[-]

Streaming, caching, and tool calling can get pretty expensive with scale, even when you don't touch inference. Maybe they're doing something clever and are quite profitable.. or maybe they've already taken $40mm from VCs and are currently trying to raise $120mm at a 1.3B evaluation.

They also show headline prices for the cheapest provider of whatever model, but then need to hit different backends some of which may be more expensive. For now they absorb those costs, but the VCs always come knocking.

Just my opinion though. Totally agreed that they have one of the best positions amongst all AI providers from a financial standpoint.

reply

upvote

by vanviegen17 hours ago|

[-]

> They also show headline prices for the cheapest provider of whatever model, but then need to hit different backends some of which may be more expensive. For now they absorb those costs, [..]

They do?? I was under the impression I was just playing the price for whatever provider they deemed 'best' for each completion.

reply

upvote

by recitedropper17 hours ago|

[-]

That is what I had heard.

Checking now: The way they describe it in their FAQ is that if the price changes, then they will bill you the new price. But I read that as regarding if the primary model provider changes their headline token cost; not in the case of pricing differences for models that have many different backends that host them.

Regardless, I would be more concerned about the streaming costs if the service continues to blow up and they scale aggressively through VC investments. If their 5.5% skim accounted for what they needed, you'd think they could effectively grow organically..

reply

upvote

by Mattwmaster5820 hours ago|

[-]

FYI, these are the multipliers for annual plan. I would hazard a guess most people are not on an annual plan

reply

upvote

by mitjam15 hours ago|

[-]

I am and I see it as stopping the music at a party when you want everyone to go home without telling them to go home. There is also the offer to quit with prorated refund for the remaining time. I think I am going to take it.

reply

upvote

by themafia16 hours ago|

[-]

We can't even get slop delivery worked out. So use SlopAggregator instead.

reply

upvote

by 2ndorderthought15 hours ago|

[-]

I don't know if it's just me but copilot kind of sucks. I've been running local models with like 9b parameters and they are about as good if not better. Obviously there's no integrations or whatever and I get most people are probably paying for that than anything else but eh. Big no thanks from me.

reply

upvote

by whateveracct20 hours ago|

[-]

"eras" tend to not be so short lol

reply

upvote

by siva720 hours ago|

[-]

That's so unfair to us hard working developers. A month ago i could buy for .4$ a turn with Sonnet. Now i have to pay at least .9$ for this turn. Weeks ago i could buy for .12$ an Opus turn after they already raised prices and now they want .27$ from me for the same product! They are stealing from us!

reply

upvote

by asdfasgasdgasdg19 hours ago|

[-]

They aren't stealing from us, for several reasons. First of all, it's a voluntary transaction. If you don't like the prices, use something else. Or don't use AI at all.

Second, you have no idea what their costs are. It is most likely that they are simply passing on their costs to you. If that was not the setup, users would just go to another service provider who was providing tokens at a cheaper rate. It's not like there is a dearth of competitors in this business.

reply

upvote

by croes19 hours ago|

[-]

The already stole when they trained their models on the data.

Now they just increase the price to buy it back

reply