This doesn’t solve the problem because (tautologically) the more AI prices go down, the less money the companies make. Suppose the companies are operating at a profit today, a price war causes API prices to sink 90% next year, and their capex amortization costs stay fixed. Revenue collapses while the fixed costs do not.

The math doesn’t math.
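A rough sketch of that arithmetic, with made-up numbers. Every figure below is hypothetical and in arbitrary units, and it assumes volume and serving cost stay the same after the price cut:

```python
# Hypothetical figures in arbitrary units; none come from a real company.
def annual_profit(revenue, serving_cost, capex_amortization):
    """Revenue minus variable serving cost minus fixed capex amortization."""
    return revenue - serving_cost - capex_amortization

revenue_today = 100.0
serving_cost = 30.0        # assumed unchanged: same volume, same efficiency
capex_amortization = 50.0  # fixed: the hardware is already bought

profit_today = annual_profit(revenue_today, serving_cost, capex_amortization)

# A 90% price cut at constant volume leaves 10% of the revenue.
revenue_after_cut = revenue_today * 0.10
profit_after_cut = annual_profit(revenue_after_cut, serving_cost, capex_amortization)

print(f"profit today:     {profit_today:+.1f}")
print(f"profit after cut: {profit_after_cut:+.1f}")
```

In this toy setup the gap only closes if serving cost per token falls below the new price and volume grows enough to cover the fixed amortization.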

reply
AI prices going down means the models are improving, particularly from the efficiency angle (which is inevitable, given the nature of tech). That means all they have to do is maintain a large enough customer base at a rate high enough to ensure losses decrease continuously over time, until eventually they pass the point where they're just gaining. Healthy competition ensures that improvement savings are actually passed on to users in a measured manner, so the companies don't become too greedy in trying to reach and then increase gains.
reply
But now you’re describing a commodity, and the competition will erode profits, and their valuations are bananas, unless someone can find a business model that truly differentiates and creates a moat.
reply
Models are not commodities and are famously non-fungible. Each model has its quirks and strengths, weaknesses and idiosyncrasies.

I know because I saw how people reacted to the 4o model. I can see Opus behaving differently enough that I pick it for certain tasks.

reply
Is this really true for comparable models though? Will folks at scale continue to choose Anthropic's frontier model if OpenAI releases a similar generation at a 90% discount with comparable capabilities? It feels like the fungibility assumes delineation by capability _and_ cost. No one is choosing Sonnet over Opus at similar price points.
reply
This doesn't really tell you anything useful. AI companies have both built huge datacenters and raised a colossal amount of money. Add in caching, quantization, etc. All of those would allow them to undercut on price considerably, even more so if you count all the users who don't actually cap out their plans. Prices going down doesn't really tell you anything about the production cost, especially in a market where every major participant is happy to burn money just for the market share.
reply
There are many open research avenues that could reduce cost dramatically: smaller task-specific, language-specific, or domain-specific models, which could even turn out to be better. The earliest computers were the size of a building. So extrapolating from the current state into unknown future possibilities is wrong. The hardware will be all the more valuable if cheaper ways to run models become possible. The hardware gets cornered, in a sense.
reply
Because of its unpredictability and massive dependence on the training data, when LLMs start hallucinating, most of the time the only fix these "engineers" have is to feed it another LLM... The genius was the transformer architecture, and evidently none of us have a damn clue how it works
reply
Every 6-12 months or so we get an increase in one or more of things like: compute power, compute efficiency, GPU power, GPU efficiency, network bandwidth increase, memory speed increase, component density increase in the same form factor, etc.

For a while, you'd start a hardware refresh every 2-3 years. As companies moved into more and more training, this timeframe started to shrink. It went from 36 months to 24 months, then from 24 months to around 16-18 months. When I last checked, last year, it was at 12 months. I think things may have slowed because of component availability, but otherwise whole data centers would be 6-12 months into full operations before they would start a refresh cycle.

Not to mention the massive increase in power density demand and cooling demand per rack that entails.

So no, "AI costs" have not gone down, in fact they are more expensive on training AND inference than ever.

This is why many are concerned about the heroin drip of API costs into orgs. For the companies that are public, look into their financials. It's gonna hit companies and high-volume users like a ton of bricks.
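To put a rough number on the refresh-cycle point: a minimal sketch, assuming straight-line amortization and no residual value for the replaced hardware. The capex figure is hypothetical; the cycle lengths are the ones quoted above.

```python
# Hypothetical capex figure; cycle lengths are the ones quoted in the comment.
def monthly_amortization(capex, refresh_months):
    """Straight-line amortization: spread capex evenly over the refresh cycle."""
    return capex / refresh_months

capex = 36.0  # arbitrary units
cycles = [36, 24, 18, 12]
monthly = {months: monthly_amortization(capex, months) for months in cycles}

for months, cost in monthly.items():
    print(f"{months:>2}-month cycle: {cost:.2f} per month")
```

Under those assumptions, going from a 36-month to a 12-month cycle triples the monthly hardware cost for the same spend, before counting power and cooling.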

reply
I'm no economist but if true don't you have the opposite problem? How do you get people to need X many tokens per day such that you can sell enough to make money? Wouldn't you need an absence of competition for that to be ok?
reply
the demand for intelligence is infinite. you sound like someone in 1960 wondering what the hell we would even do with the functionally infinite cpu cycles we have available to us now.
reply
If you are an AI bear you have multiple techniques at your disposal

- if AI costs go down you can ask how the companies will make profit and then suggest the bubble popping

- if AI costs go up you can ask how people will afford it and then suggest the bubble popping

- if companies actually do make profit then you can say the companies are getting too big and powerful so it’s a bad thing for consumers

Essentially you have left anywhere from zero to a very narrow path of outcomes you would be happy with.

reply
I get your point but it still like begs the question right? If you are optimistic about it all, what is the good narrative? What does it all look like? Billions upon billions of prompts to the finest models every millisecond? And then we have to scale on top of that? To like what end? How many apps do we need to code? How many questions can a single person even ask on any given day? Do I lack some imagination here?

Like what if they don't necessarily have to be super duper money making machines to legitimate how useful and nice they are for you? Is that even conceivable? What if tomorrow we all decided they are more like utilities? Would that change anything intrinsic about them for you?

reply
Can you cite a source? Everything I've read describes the costing as linear with growth.
reply
The quality of what you can get from DeepSeek V4 Pro for $10 is light years ahead of what you could get for $20 a year ago.

Likewise, the quality of what I can get from a local model like Qwen 3.6 on an RTX 5090 is light years ahead of what I could get a year ago on the same hardware.

reply
That article seems a bit bogus. Cost per capability is a soft, non-predictive measure, unlike cost per token, which has been trending up.
reply
This is just hand waving on the obvious consensus that cost per capability is going down. There’s no doubt about it. Hell you can run a Gemma 4 model on your laptop that mogs GPT 4. But yeah you can use fuzziness as an excuse and ignore the trend.
reply
What?
reply
He's saying output for 1M tokens on the latest models is $50 now when it used to be $2500.
reply
so how are these labs going to recoup the insane training costs at those prices? even if there is still a fat margin leftover afterwards
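A back-of-the-envelope sketch of what recouping would take. The $50 per 1M output tokens is the figure quoted upthread (not verified); the training cost and margin are hypothetical placeholders, and ongoing retraining is ignored:

```python
def tokens_to_recoup(training_cost_usd, price_per_million_usd, gross_margin):
    """Output tokens that must be sold for gross profit to equal training cost."""
    profit_per_million = price_per_million_usd * gross_margin
    return training_cost_usd / profit_per_million * 1_000_000

price_per_million = 50.0  # figure quoted upthread, not verified
training_cost = 1e9       # hypothetical
gross_margin = 0.5        # hypothetical "fat margin"

tokens = tokens_to_recoup(training_cost, price_per_million, gross_margin)
print(f"{tokens:.2e} output tokens")
```

Swap in your own guesses for the cost and margin; the required volume scales linearly with training cost and inversely with price and margin.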
reply
They also have to continuously train, forever, to avoid model drift. It's not a one and done thing as far as I'm aware.
reply