For a long time, too. Programming languages rarely change much and techniques rarely change, so I should be able to use said model for, I hope, at least five years; and if at any time they optimize local models to cram even more intelligence into the same amount of VRAM, I can upgrade to that.
I like this path.
I experiment with all of the local models I can fit into 32GB of VRAM and I have subscriptions to multiple SOTA providers.
The difference between them is very large, unfortunately. The local models can handle small tasks and refactoring mostly okay, but doing anything challenging with them becomes a waste of time. Unfortunately the waste isn’t immediately obvious because they will come back with something that looks like it works, but then on closer examination I need to throw it out and reset them in a usable direction.
I have seen the results of some early attempts. It fails in such hilarious ways that all these companies are scared of productizing it. But once someone does it, the taboo is broken and everyone else will follow suit immediately.
I don’t know OpenAI’s infra, but to the extent they are buying GPUs and building data centers with their own money, that sounds like a bad move.
Satya has mismanaged the AI transition in many ways, but one thing he got right is that models are commodities, and the value is in applications that apply them to create user benefit. I agree that any company trying to build a moat with a model is not long for this world.
This is also why the money being poured into datacenters isn't going to result in as much development as you think. It's about leveraging other people's money to lock down more future hardware. This is going to end exactly like the fiber build-out in the 2000s. Eventually that fiber got used, but the folks who originally paid for it got hosed.
Paradoxically, the better the results we get from general-purpose coding agent harnesses, the less moat Claude and co. get. It's unbelievable how fast some open models outpaced the frontier models of just a few months ago.
In my opinion, the bottleneck is the package management layer and not the model capabilities and performance.
I have been an avid Linux user for decades, and if I find it confusing and painful, something is missing.
Sadly - it's going to be ads. Advertising is going to get in there and enshittify the whole thing because as always, advertising income is too easy and too plentiful for any company to resist.
Right now the models are fairly agnostic, but we are a hair's breadth away from ChatGPT responding with, "the right tool for this job is a circular saw - something like the Milwaukee M18, which happens to be on sale at Home Depot this weekend."
Enough to validate repurposing an existing workstation with enough RAM, or finding a used high VRAM GPU, or in my case buying a Strix Halo system for home lab and local models.
For AI tools, the future is once again not cloud-based.
It makes sense to show some ads and get some money at low volume (like a faraway reader wanting to read a story in your local newspaper) but taking money from regular users directly will pay much more.
Newspapers are happy to cannibalize 99% of their ad revenue with a paywall if that 1% subscribes because that’s how much more money you make from someone paying $10-$20/month vs ads.
But yeah, if people use it as a buying recommendation engine, that’s where the money is on ads/referrals, but a lot of AI use has little or no connection to buying-intent touchpoints.
LLMs may or may not be able to cover their costs with it. We'll see - I suspect product placement as recommendations will become a thing, as it won't take as much GPU to give a "recommendation" on "the best widget for X". I firmly expect it to become enshittified the same way Google and Amazon search have.
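To put rough numbers on the paywall trade-off above, here is a back-of-the-envelope sketch. It assumes a simple model that is mine, not the newspapers': every reader earns the same ad revenue per month, a paywall keeps only the subscribers, and those subscribers still see ads. The function name and the model itself are hypothetical; only the 1% share and the $10-$20/month price come from the comment.

```python
def breakeven_ad_revenue(price_per_month: float, subscriber_share: float) -> float:
    """Ad revenue per reader per month below which the paywall earns more.

    Assumed model (hypothetical, for illustration only):
      no paywall:   revenue per reader = a
      with paywall: revenue per reader = s * price + s * a
    where a is ad revenue per reader and s is the share who subscribe.
    The paywall wins when s * price + s * a > a, i.e. a < s * price / (1 - s).
    """
    if not 0.0 < subscriber_share < 1.0:
        raise ValueError("subscriber_share must be strictly between 0 and 1")
    return subscriber_share * price_per_month / (1.0 - subscriber_share)


# The 1% share and the $10-$20 price range are the figures quoted in the thread.
for price in (10.0, 20.0):
    limit = breakeven_ad_revenue(price, 0.01)
    print(f"${price:.0f}/month: paywall wins if ads earn under ${limit:.2f} per reader per month")
```

Under those assumptions the break-even is roughly $0.10 to $0.20 of ad revenue per reader per month, so the paywall only loses if each casual reader is worth more than that in ads.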
And that's if LLMs don't become commodified.
Now it's compliant with the law.
I run my word processing software on my Apple II (a total joke of a computer) instead of running it on the WANG.
I run my bookkeeping software on VisiCalc instead of the IBM.
I run my simulation software on my IBM PC (I even paid for the 8087!) instead of the VAX.
Moore's law has, at least so far, allowed the pioneers with toy computers to grow their toys big enough to solve "big boy" problems: given some time, the toy computers get faster and the pioneers scale their crappy home-grown solution to cover the 60% of the problem that was originally solved by some enormous complex system.
Eventually the toy infrastructure gets expensive and solves 90-120% of the "big iron" problem space, and it also grows to cost as much as the big iron solution. Then a new generation of toy software and toy systems emerges to disrupt the "big iron" systems.
See also http://www.catb.org/jargon/html/W/wheel-of-reincarnation.htm...
If a vendor can SaaS a solution, then enterprise is generally happy (they don't want to have to hire folks for maintenance), and that completely locks out any ability to run locally.
Between enterprise's ambivalence and the obvious financial incentive to vendors, you get SaaS-only products.
Make the local AI competent enough to do good image generation and editing, realtime voice and music generation, handle agentic tasks with a framework like Hermes, and you can take your AI places to do tasks in contexts that are inaccessible to or inappropriate for cloud.
Frontier big platform models will be the best, but there's a level of "good enough" for local uses that we're already seeing flourish, and "good enough" for the average joe is almost here.
People -- WANT -- this technology on their home devices and (apparently?) the providers of this tech don't seem to be running a profit so they probably don't want the maintenance tail on their side either.
I think it's a bit different. Inevitable that this becomes a household-run thing? Not likely.
But my downtimes are a bit self-inflicted: changing ISPs, which I can personally work around, but that is harder for a blog where one expects uptime.
The primary feature of "AI" is to process information and reason with a natural language interface at speed; the primary feature of the AI big boys is to provide the machinery that runs the "models".
See the difference?
Hosting a blog 24x7 on a laptop is trivial, except for hyperscaling to the front page of HN and Reddit.
I think you've misunderstood what "good enough" means in this context: a model capable of completing the tasks assigned to it without having the breadth of full generalization. Your analogy breaks down because of this - we did get "good enough" spec profiles for different hardware. That thing you're wearing on your wrist won't have the same specifications as the box you use to play games.
> a model capable of completing the tasks assigned to it
The thing is, the "task assigned to it" is changing with improved capabilities. If everyone around you in 2036 is using general AI to do amazing stuff, you will probably have little interest in vibe coding slop like it's 2026.
Only if you give in to fads and FOMO.
The core tasks people need change at a much smaller pace.
That's correct. The problem is they have smart people, tons of money, and several years to figure that out, and the best thing they can come up with is a coding agent.
The ‘best’ things are:
- fuzzy pattern matching algorithms for traffic analysis, human and other image target recognition.
- targeting algorithms that identify ‘suspicious’ individuals in large volumes of metadata.
- fraud analysis
- antagonistic image and video generation, both for fooling other fraud analysis and for propaganda, screwing with other actors, etc.
- directed high speed content generation (text, pictures, video) to spam the ‘algorithm’ and allow near realtime identification of additional buttons to push for given target audiences.
- massive marketing/ad manipulation.
Those budget line items (and the suppliers) really want to stay off the radar, however, as attention makes their life harder.
That would be the dream... no fucking Electron! No lockdown modules.