Infrastructure is massively complex and multi-cloud is super hard to do. Switching LLMs is... a dropdown.

Now, that doesn't mean running your own LLM will be easy, but it does mean that regional LLMs, at the very least, are a lot more likely, in my opinion. That is, there will be Google, whichever (if any) is left standing of OpenAI or Anthropic, and then there will be Chinese-hosted LLMs, probably Indian-hosted LLMs, European-hosted LLMs, plus LLMs hosted on managed services (e.g. Bedrock). I can certainly see large banks and the like hosting the best OSS or even licensed LLMs in their own cloud infrastructure accounts (e.g. at AWS, Azure, etc.).

And that's on top of the LLMs running on owned server infrastructure, plus actual local, on-device LLMs.

reply
You're using the future tense, but all of those things already exist. Google exists, Amazon Bedrock exists, DeepSeek's cloud product exists, etc. But this isn't relevant to what the post you are replying to said, which is that "cloud-based, metered AI being a dominant work mode [is a] fad", since all of those things are cloud-based, metered AI.
reply
I was talking more about the on-premises, private-cloud and on-device stuff, as I said.

If you look at what Uber is spending per developer per month, they clearly have some headroom to consider whether more-local, unmetered AI tools (on device, on premises, in private cloud) can be used cost-effectively to cut down how much money they are pouring into Anthropic and OpenAI. Not least because a bit of centralised effort might lead them to distilled models that are better for their purposes. Some of that budget could go into simply putting a bit more capacity on a developer's desk.

Can they do it now for everything? Obviously not. But IMO there is no reason at all for planning and scaffolding tasks to be done with cloud models, and there are many reasons why it might be better to do document processing without leaving the premises.

The incentives are there on the technical, operational and particularly the business levels, and the relative disruption of the switch is really small, considering that all the tooling can already use different models for different tasks. They must at least be investigating the possibility; it would be irresponsible not to.

reply