I mean, GitHub Copilot's pricing just went up considerably, so I guess they were right?
In the long term, tokens will fall in price. Obviously. (If "tokens" continues to be the unit)
In the short to medium term, for the IPOs to succeed, people have to start actually paying for what they are using, so the price will go up, and is going up, quite a lot. Once prices settle at that level they will slowly fall from that point (or from some point maybe halfway, depending on how much the market is willing to continue to subsidise).
I am an AI cynic, but I am now an informed cynic; I am learning agentic tools so I know where they are useful and I know my enemy.
I think the "fad" here is cloud-based, metered AI being a dominant work mode.
Nothing so far has suggested to me that any outcome is more likely than edge- to local-scale, on-device, on-laptop, on-prem models getting good enough that people use them by default and use the cloud models only when they need the extra oomph.
I cannot believe that there is anything other than an enormous incentive for companies like Uber to find local, small model and on-premises solutions to their problems, not least while pricing is so changeable and people are getting nasty surprises.
Betting on OpenAI and Anthropic being around over the long term in the form that they are now feels like valley hopium. Utility monopolies essentially always derive from physical/geographical limitations, don't they?
While I hope local AI continues to exist, I'm skeptical that it will take over, for the same reason running your own servers hasn't taken over. It's just hard, and involves spending huge sums of money up front.
It's also not really clear how heavily tokens are being subsidized. The discussion reminds me of Uber. For years people on HN claimed that Uber was going to collapse once they ran out of VC money. Then... that never happened, and everyone just moved on to discussing other things.
Now, that doesn't mean running your own LLM will be easy, but this will mean it's a lot more likely that there will be at least regional LLMs, in my opinion. I.e. there will be Google, whichever (if any) is left standing of OpenAI or Anthropic, and then there will be Chinese hosted LLMs, probably Indian hosted LLMs, European hosted LLMs, plus LLMs hosted on managed services (e.g. Bedrock). For sure I see large banks and the like being able to host the best OSS or even licensed LLMs on their own cloud infrastructure accounts (e.g. at AWS or Azure).
And that's on top of the LLMs running on owned server infrastructure plus actual local, on device LLMs.
If you look at what Uber is spending per developer per month, they clearly have some headroom to consider whether more-local, unmetered AI tools on device, on premises, in private cloud, can be cost-effectively used to cut down how much money they are pouring into Anthropic and OpenAI. Not least because a bit of centralised effort might lead them to distilled models that are better for their purposes. Some of that budget could go into simply putting a bit more capacity on a developer's desk.
Can they do it now for everything? Obviously not. But IMO there is no reason at all for planning and scaffolding tasks to be done with cloud models, and there are many reasons why it might be better to do document processing without leaving the premises.
The incentives are there on the technical, operations and particularly on the business levels, and the relative disruption of the switch is really small, considering that all the tooling can use different models for different tasks already. They must at least be investigating the possibility; it would be irresponsible not to.
Not impossible, not unlikely, probably 50-50.
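For what it's worth, the "different models for different tasks" split is simple enough to sketch. This is only an illustration of the idea, not how any particular tool does it: the task names, the `Route` type and the `route_task` function are all made up here, and the choice of which tasks count as "local-friendly" is my assumption based on the planning, scaffolding and document-processing examples above.

```python
from dataclasses import dataclass

# Hypothetical task categories assumed to be fine on a local or on-prem model.
# The names are illustrative only, not taken from any real tool's config.
LOCAL_TASKS = {"planning", "scaffolding", "document_processing"}


@dataclass(frozen=True)
class Route:
    tier: str    # "local" or "cloud"
    reason: str  # short human-readable justification


def route_task(task_kind: str, contains_private_data: bool = False) -> Route:
    """Pick a model tier for a task (a sketch, not a real policy)."""
    # Anything touching private data stays on the premises, whatever the task.
    if contains_private_data:
        return Route("local", "data should not leave the premises")
    # Cheap, routine work goes to the unmetered local model by default.
    if task_kind in LOCAL_TASKS:
        return Route("local", "small model assumed good enough")
    # Everything else falls through to the metered cloud model.
    return Route("cloud", "needs the extra oomph")


if __name__ == "__main__":
    for kind in ("planning", "large_refactor"):
        print(kind, route_task(kind))
```

The point is that the default is local and the cloud is the exception, so the metered spend only covers the tasks that genuinely need it.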