> So there's a huge number of HN posters claiming that the price of tokens will go UP over time rather than down (that's how Moore's Law works, right???)

I mean, GitHub Copilot's pricing just went up considerably, so I guess they were right?

by dofm 21 hours ago

I don't think it is unreasonable to say both will happen, is it?

In the long term, tokens will fall in price. Obviously. (If "tokens" continues to be the unit)

In the short to medium term, for the IPOs to succeed, people have to start actually paying for what they are using, so the price will go up, and is going up, quite a lot. Once their value is set they will slowly fall from that point (or some point maybe halfway, depending on how much the market is willing to continue to subsidise).

I am an AI cynic, but I am now an informed cynic; I am learning agentic tools so I know where they are useful and I know my enemy.

I think the "fad" here is cloud-based, metered AI being a dominant work mode.

Nothing, so far, has suggested to me that any outcome is more likely than edge- to local-scale, on-device, on-laptop, on-prem models getting good enough that people use them by default and use the cloud models only when they need the extra oomph.

I cannot believe that there is anything other than an enormous incentive for companies like Uber to find local, small model and on-premises solutions to their problems, not least while pricing is so changeable and people are getting nasty surprises.

Betting on OpenAI and Anthropic being around over the long term in the form that they are now feels like valley hopium. Utility monopolies essentially always derive from physical/geographical limitations, don't they?

by jujube3 18 hours ago

I mean, there's an "enormous incentive" for people to run their own data centers rather than using AWS. And yet, cloud is growing and on-premise is shrinking.

While I hope local AI continues to exist, I'm skeptical that it will take over, for the same reason running your own servers hasn't taken over. It's just hard, and involves spending huge sums of money up front.

It's also not really clear how much tokens are being subsidized. The discussion reminds me of Uber. For years people on HN claimed that Uber was going to collapse once they ran out of VC money. Then... that never happened, and everyone just moved on to discussing other things.

by oblio 11 hours ago

Infrastructure is massively complex and multi-cloud is super hard to do. Switching LLMs is... a drop-down.

Now, that doesn't mean running your own LLM will be easy, but it does mean that regional LLMs, at least, are a lot more likely, in my opinion. I.e. there will be Google, whichever (if any) is left standing of OpenAI or Anthropic, and then there will be Chinese-hosted LLMs, probably Indian-hosted LLMs, European-hosted LLMs, plus LLMs hosted on managed services (e.g. Bedrock). For sure I see large banks and the like being able to host the best OSS or even licensed LLMs on their own cloud infrastructure accounts (e.g. at AWS, Azure, etc.).

And that's on top of the LLMs running on owned server infrastructure plus actual local, on device LLMs.

by jujube3 3 hours ago

You're using the future tense, but all of those things already exist. Google exists, Amazon Bedrock exists, DeepSeek's cloud product exists, etc. etc. But this isn't relevant to the post you are replying to, which said that "cloud-based, metered AI being a dominant work mode [is a] fad". All of those things are cloud-based, metered AI.

by dofm 2 hours ago

I was talking more about on-premises, on private cloud and on-device stuff, as I said.

If you look at what Uber is spending per developer per month, they clearly have some headroom to consider whether more-local, unmetered AI tools on device, on premises, in private cloud, can be cost-effectively used to cut down how much money they are pouring into Anthropic and OpenAI. Not least because a bit of centralised effort might lead them to distilled models that are better for their purposes. Some of that budget could go into simply putting a bit more capacity on a developer's desk.

Can they do it now for everything? Obviously not. But IMO there is no reason at all for planning and scaffolding tasks to be done with cloud models, and there are many reasons why it might be better to do document processing without leaving the premises.

The incentives are there on the technical, operations and particularly on the business levels, and the relative disruption of the switch is really small, considering that all the tooling can use different models for different tasks already. They must at least be investigating the possibility; it's irresponsible not to.

by Der_Einzige 18 hours ago

Token costs do go down over time for sure due to software optimizations (e.g. better attention kernels), but acting like hardware INFLATION isn't happening for at least a few more years is just nonsense. Objectively, an A100 (a 7-year-old GPU - Big Short guy is a turbo idiot) is more expensive to rent today than it was in 2024, and the price is still rising. As such, over short time horizons, it's possible to see limited amounts of "price per token goes up" for the same model.
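To make the arithmetic concrete, here is a rough sketch of how the two forces pull against each other. It assumes serving cost per token is just GPU rent divided by throughput, which ignores utilization, batching, margins and everything else. The function name and every number are made-up placeholders, not real A100 prices or real throughput figures.

```python
def cost_per_million_tokens(gpu_rent_per_hour: float, tokens_per_second: float) -> float:
    """Dollars per one million generated tokens on a single rented GPU.

    Simplifying assumption: cost = hourly rent / tokens produced per hour.
    """
    tokens_per_hour = tokens_per_second * 3600
    return gpu_rent_per_hour / tokens_per_hour * 1_000_000


# Hypothetical placeholder inputs, NOT real market prices or benchmarks.
baseline = cost_per_million_tokens(gpu_rent_per_hour=1.00, tokens_per_second=1000)

# Rent rises 50% and the software stack is unchanged: price per token goes up.
rent_up = cost_per_million_tokens(gpu_rent_per_hour=1.50, tokens_per_second=1000)

# Rent rises 50% but better kernels double throughput: price per token still falls.
rent_up_faster_kernels = cost_per_million_tokens(gpu_rent_per_hour=1.50, tokens_per_second=2000)

print(f"baseline:               ${baseline:.3f} per 1M tokens")
print(f"rent up:                ${rent_up:.3f} per 1M tokens")
print(f"rent up, faster kernels: ${rent_up_faster_kernels:.3f} per 1M tokens")
```

Under these assumptions, whichever of the two moves faster over a given window decides the direction, so a short stretch of rising per-token prices for the same model is compatible with a falling long-run trend.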

by oblio 11 hours ago

It's a mix. If the current wave of LLM businesses craters, demand for LLM-specific hardware (and related hardware) will crater too. GPUs were propped up by cryptocurrencies and now by LLMs. They're still great at doing fundamental math operations, but for their value to stay up, another massive business opportunity involving matrix multiplication and the like would need to arise as soon as the current business cycle winds down.

Not impossible, not unlikely, probably 50-50.