Users really don’t matter at all. The revenue for AI companies will be B2B, where the user is not the customer - including coding agents. Most people don’t even use computers as their primary “computing device,” and most people are buying crappy low-end Android phones - no, I’m not saying all Android phones are crappy, but that’s what most people are buying, with the average selling price of an Android phone being around $300.
I worked for a research focused AI startup that had a strict "no external LLM" policy for code touching our core research.
You're right that the average consumer doesn't care about privacy, but there are many, many users who do. The average consumer also doesn't have a desktop with a GPU or a high-end Mac Studio, but that doesn't mean there aren't many people working with AI who do have these things.
If we continue to see improvements in running local models, and RAM prices continue to fall as they have in the last month, then suddenly you don't have to worry about token counts any more and can be much more trusting of your agents since they are fully under your control.
Yeah but if they can rake in 100x as much by making products for people who don't care about privacy, then why spend time developing stuff for people who care?
There is still a small market left, of course, but that market will not have the billions of R&D behind it.
People have said this since PyTorch was published, and it's not any more true now than it was 10 years ago.
Every company has dozens of SaaS products that store its business-critical information. Amazon installs Office on every computer, uses Slack (they were moving away from Chime when I left), and the sales department uses Salesforce - as do SAs and Professional Services (I'm a former employee).
The addressable market of even the companies that care about privacy is not large. How long will it be before computers that can run even GPT-4-level LLMs become cheap enough that companies will give one to all of their developers?
They aren’t buying high end $2000+ Mac Minis.
They are. The majority aren't doing inference on a Mac Mini, but instead using it as a local host for cloud-based inference. You could have the same general experience on a $200 Chromebook or $300 Windows box.
The world is not moving back to on prem.
As someone who has hardware in that price range and plays with local LLMs: The gap between Opus or GPT and the local models is still very large for work beyond simple queries.
Self-hosting also makes my office hot from all of the power consumption when I use it for anything more than short queries. If you haven't heard your Mac's fans spin up much yet, running local LLMs will get you acquainted with the sound of its cooling system at full blast.
Lol, you should tell my customers (that are moving back on prem) that!
You should also tell Microsoft, who just yesterday said they are going back to focusing on local apps.
You have data showing growth in cloud, which I expect and don't disagree with. The data I come across shows this too!
What I disagree with, from my own experiences and all the data I can seem to find online is that the growth rate in repatriation is MUCH higher than the growth in cloud.
It has flipped over the last 3yr.
US Enterprises, Fortune 100, especially. Also a lot of public entities (gov).
"In 2025, repatriation is still generally an upward trend. Data from the end of 2024 showed that 86% of CIOs planned to move some public cloud workloads back to private cloud or on-premises — the highest on record for the Barclays CIO Survey."
"Real examples of cloud repatriation include Dropbox, Adobe, and GEICO. All three companies moved a significant portion of their infrastructure onto public cloud before moving it to a combination of on-premises and hybrid cloud providers."
Noted: SaaS accounts for 46.10% of market revenue, while PaaS is the fastest-growing segment at 21.35% CAGR
Also, when I searched for your quotation, the very next paragraph was:
“ This trend does not represent a rejection of cloud computing. Organizations continue investing heavily in cloud services, with Gartner forecasting that global cloud spending will reach approximately $723 billion by the end of 2025.”
Care at all? Sure. But enough to make a difference? The history of the web and recent computing history indicate otherwise.
No? What? Oh, you can't?
Neither can consumers. Most consumers are very aware of the lack of privacy, the manipulation, and have very cynical feelings about Facebook and similar companies. But it's where their friends and family are.
For most people the web is a minefield where the basic things they want are compromised everywhere. And they are routinely creeped out by ads revealing that advertisers know them far too personally.
You are mistaking network capture for preference.
Another telling example: lots of privacy-valuing technical people, who would never have a Facebook account, send unencrypted text emails.
It is network capture, not preference.
They are choosing to give Facebook info.
Yes, they do. That is exactly the phenomenon my comment addressed.
But the way you wrote that implies an improbable motivation or choice framing.
Perhaps their real motive/choice is to share with other people on the site.
It is called a network effect.
If (1) Facebook had been the surveillance/manipulation capital of the world from inception, (2) an equally inviting privacy protecting site took off at the same time, and (3) everyone chose Facebook over E2EE anyway, then sure, we could throw up our hands! Those silly users!
The term I have for when people discuss choices involving many-dimensional criteria, as if the choice involved just one or two selected dimensions, is "dimension blindness". It happens in a lot of heated discussions about phone choices too.
This is clearly true. There is an implied point here but I am not sure what.
They share in their profile what they want other people to see. And often choose to not fill out everything. Nobody signs up to share with Meta, Inc.
Most people would love a "[ ] Do not share with Facebook".
People choosing an imperfect option, from imperfect options, are not demonstrating evidence they don't care about the imperfections.
> Would the button disable them from checking in and updating their profile?
No.
Oh, and wait till ad companies start selling your healthcare data - you will see how fast things turn, 'given a choice'.
This might be the cost of privacy, and it might be worth paying, unless cloud models reach an inflection point that makes local models archaic.
I would expect consumer inference ASIC chips will emerge when model developments start plateauing, and "baking" a highly capable and dense model to a chip makes economic sense.
I could be wrong because I'm not following this too closely, but the open weights future of both Llama and Qwen looks tenuous to me. Yes, there are others, but I don't understand the business model.
Cost is a pretty big reason.
Why waste time with subpar AI?
Actual consumers not only don't care, they will not even be aware of the difference.
What are you doing with these local models that run at x tokens/sec?
Do you have the equivalent of ChatGPT running entirely locally? What do you do with it? Why? I honestly don’t understand the point or use case.
2. They aren't harvesting your data for government files or training purposes
3. They won't be altered overnight to push advertising or a political agenda
4. They won't have their pricing raised at will
5. They won't disappear as soon as their host wants you to switch
What are you doing with it?
Why do you want it?
None of them are as good as the big hosted models, but you might be surprised at how capable they are. I like running things locally when I can, and I also like not worrying about accidentally burning through tokens.
I think the future is multiple locally run models that call out to hosted models when necessary. I can imagine every device coming with a base model and using LoRAs to learn about the user's needs, with companies and maybe even households having their own shared models that do heavier lifting, while companies like OpenAI and Anthropic continue to host the most powerful and expensive options.
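The "local first, hosted fallback" idea above can be sketched in a few lines. This is just an illustration of the routing logic, not any real API: the model callables and the word-count heuristic are stand-ins I made up for the example.

```python
# Sketch of local-first routing with a hosted fallback.
# The "models" here are placeholder callables; a real setup would
# wrap a local inference server and a cloud API instead.

from typing import Callable

def make_router(local: Callable[[str], str],
                hosted: Callable[[str], str],
                max_local_words: int = 50) -> Callable[[str], str]:
    """Return a function that sends small prompts to the local model
    and escalates heavier ones to the hosted model."""
    def route(prompt: str) -> str:
        # Crude stand-in for task difficulty: long prompts go to the cloud.
        if len(prompt.split()) <= max_local_words:
            return local(prompt)
        return hosted(prompt)
    return route

# Stand-in models for illustration only.
router = make_router(
    local=lambda p: "[local] " + p[:20],
    hosted=lambda p: "[hosted] " + p[:20],
)
```

In practice the interesting part is the escalation policy: a device-resident model could even decide for itself when a request is beyond it, rather than using a fixed threshold like this.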
I still don’t understand. What are you using this thing you’re running locally to actually do?
What is the use case?