Happened to me. Copilot changing prices prompted me to cancel my Copilot subscription and install a local coding model running entirely in VRAM. Will call Claude APIs when I get really stuck, but I should be able to handle 80% of my needs with a dumber local model.

For a long time, too. Programming languages rarely change much, techniques rarely change, so I should be able to use said model for I hope at least five years; and if at any time they optimize local models to cram even more intelligence into the same amount of VRAM, I can upgrade to that.

I like this path.

by Aurornis2 hours ago|

> Will call Claude APIs when I get really stuck, but I should be able to handle 80% of my needs with a dumber local model.

I experiment with all of the local models I can fit into 32GB of VRAM and I have subscriptions to multiple SOTA providers.

The difference between them is very large, unfortunately. The local models can handle small tasks and refactoring mostly okay, but doing anything challenging with them becomes a waste of time. Unfortunately the waste isn’t immediately obvious because they will come back with something that looks like it works, but then on closer examination I need to throw it out and reset them in a usable direction.

by 1 hours ago|

deleted

by Npovview28 minutes ago|

More likely we will have a compute device, like a NAS, that runs one good model locally for everyone in the house, just as every house has one Wi-Fi router. Nvidia can invest in building such a device as well as the models and make money on the hardware.

by PLenz5 hours ago|

This. OpenAI and Anthropic are ultimately compute infrastructure plays and not really AI. Everyone will have models, they'll have the ability to run them. This is why the GPU shortage is in their favor.

by ryandvm4 hours ago|

And like Google and Meta, these companies are going to morph into advertising giants. Advertising is an economic black hole and it eats everything that comes close.

by fooker2 hours ago|

Embedding ads in LLM responses is something researchers are having a lot of trouble figuring out right now.

I have seen the results of some early attempts. It fails in such hilarious ways that all these companies are scared of productizing it. But once someone does it, the taboo is broken and everyone else will follow suit immediately.

by jaimie1 hours ago|

It's already being done: https://openai.com/index/testing-ads-in-chatgpt/

by brookst4 hours ago|

How does that view align with Anthropic leasing data centers from others?

I don’t know OpenAI’s infra, but to the extent they are buying GPUs and building data centers with their own money, that sounds like a bad move.

Satya has mismanaged the AI transition in many ways, but one thing he got right is that models are commodities, and the value is in applications that apply them to create user benefit. I agree that any company trying to build a moat with a model is not long for this world.

by cmiles83 hours ago|

Then they go bankrupt.

by butokai5 hours ago|

Do you think there will still be an incentive to release weights in that scenario? Everyone will have models only if there continue to be companies releasing weights.

by PLenz5 hours ago|

Companies won't, but I suspect something else open source-y will fill that niche. Maybe orgs like Wikimedia or the Internet Archive, maybe some hackers just making things, maybe nation states that want to disrupt other players. Also, model training will get better and better on both the algo and the hardware side. You can easily see a world where you might be able to train a good enough model on a home lab in a few days.

by rmoriz2 hours ago|

But you will need training data. Like a whole Internet search engine or massive data scraping. That's a thing that will not change with better algorithms, hardware or cheaper energy.

by PLenz2 hours ago|

Data is the only moat, but they'll be starting in the same place the current set of players started out just a few years ago. I suspect that the delta between what is publicly available (if not legally publicly available! see Sci-Hub) and what OpenAI and Anthropic have is relatively small.

by aorloff2 hours ago|

Maybe. But if we can all run our own model locally in 2 years on commodity hardware, OpenAI and Anthropic will start to look like WeWork during the pandemic.

by PLenz2 hours ago|

I agree with you that they are headed in that direction! The GPU shortage is (I think) similar to the pandemic-era hiring binge. It's less about the extra compute and more about denying the GPUs to potential competitors. They're racing against time to find something that gives them a real moat (gen AI, I guess?) and they are trading money for time.

This is also why the money being poured into datacenters isn't going to result in as much development as you think. It's about leveraging other people's money to lock down more future hardware. This is going to end exactly like the fiber build-out in the 2000s. Eventually that fiber got used, but the folks who originally paid for it got hosed.

by rmoriz2 hours ago|

And free model supply will stop…

by jayd162 hours ago|

I wonder if Google will put out a free model with the ads already baked in.

by kyboren1 hours ago|

If you mean releasing model weights: They won't, because they know the "shill something" vector will get abliterated immediately. And they can't use trade secrets or copyright to stop it, either, because they released the model themselves and you don't need to redistribute weights, just an adblocker LoRA.

by mv44 hours ago|

You just described the absolute nightmare scenario for the newly minted trillion-dollar companies whose only hope is for enterprises and SMBs to move all their business processes to the cloud, with employees competing at token maxxing.

by benterix5 hours ago|

I wouldn't say "completely implode", too much money was poured into it, but it's clear we're heading in that direction. You get a model that is "good enough", plus privacy, plus savings in the long term.

Paradoxically, the better the results we get from general coding-agent harnesses, the less moat Claude and co. get. It's unbelievable how fast some open models outpaced the frontier models of just a few months ago.

by brightball5 hours ago|

I keep intending to find time to try them. What are you seeing the best results with?

by fooker2 hours ago|

If you are willing to spend about 2000 on GPUs, we are almost there.

In my opinion, the bottleneck is the package management layer and not the model capabilities and performance.

I have been an avid Linux user for decades, and if I find it confusing and painful, something is missing.

by ryandvm4 hours ago|

I disagree. We are currently in a weird period where these frontier AI companies are losing tons of money even on the subscription-based AI models. It's just too compute intensive and there's no way most people are going to be buying the kind of hardware required to run $20 worth of inference every day.

Sadly - it's going to be ads. Advertising is going to get in there and enshittify the whole thing because as always, advertising income is too easy and too plentiful for any company to resist.

Right now the models are fairly agnostic, but we are a hair's breadth away from ChatGPT responding with, "the right tool for this job is a circular saw - something like the Milwaukee M18, which happens to be on sale at Home Depot this weekend."

by selicos3 hours ago|

$20/day x 250 days per year x # devs/agents/etc = $$$. About $5k per dev at that daily use case.

Enough to validate repurposing an existing workstation with enough RAM, or finding a used high VRAM GPU, or in my case buying a Strix Halo system for home lab and local models.

The future is once again not cloud based, for AI tools.
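
The arithmetic above can be sketched as a quick break-even check. The $20/day and 250 working days are the figures from this comment; the hardware price in the example is a hypothetical placeholder, not a quote for any real system, and electricity, maintenance, and model quality are all ignored.

```python
# Rough break-even sketch: cloud inference spend vs. a one-time hardware purchase.
# DAILY_SPEND and WORK_DAYS come from the comment above; the hardware price
# used in the example is a made-up placeholder, not a real quote.

DAILY_SPEND = 20.0   # USD per dev per working day
WORK_DAYS = 250      # working days per year


def annual_cloud_cost(devs: int = 1) -> float:
    """Yearly cloud spend for `devs` developers at the daily rate above."""
    return DAILY_SPEND * WORK_DAYS * devs


def break_even_days(hardware_cost: float, devs: int = 1) -> float:
    """Working days of cloud spend that add up to the hardware price.

    Ignores electricity, maintenance, and any quality gap between models.
    """
    return hardware_cost / (DAILY_SPEND * devs)


if __name__ == "__main__":
    print(annual_cloud_cost())        # 5000.0, the "about $5k per dev" figure
    print(break_even_days(2500.0))    # 125.0 for a hypothetical $2,500 box
```

At the stated rate a box pays for itself within a working year only if it costs less than the $5k annual spend, and only if the local model really replaces that usage.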

by zozbot2344 hours ago|

Most people are running a whole lot less than $20's worth of tokens per day on cloud platforms. (Is that assuming a frontier model? 1M output tokens per day?) Local hardware could easily take up that workload, at least the part of it that's non-time-critical.

by enoint4 hours ago|

The advertising future looks like that to me, too. Service proxies like OpenRouter might talk about price optimization, maybe some ad filtering. But I expect proxies will have malicious entries, too, surreptitiously altering agentic prompts.

by Scoundreller4 hours ago|

Ads are usually the workaround where you don’t deliver enough value to get people to subscribe or payments are unavailable for some reason.

It makes sense to show some ads and get some money at low volume (like a faraway reader wanting to read a story in your local newspaper) but taking money from regular users directly will pay much more.

Newspapers are happy to cannibalize 99% of their ad revenue with a paywall if that 1% subscribes because that’s how much more money you make from someone paying $10-$20/month vs ads.

But yeah, if people use it as a buying recommendation engine, that’s where the money is on ads/referrals but a lot of AI use has little/no connection to buying intent touchpoints.

by hylaride4 hours ago|

Newspapers had no choice after Craigslist and later Google/Facebook took all their classified revenue.

LLMs may or may not be able to cover their costs with it. We'll see - I suspect product placement as recommendations will become a thing, as it won't take as much GPU to give a "recommendation" on "the best widget for X". I firmly expect it to become enshittified the same way Google and Amazon search have.

And that's if LLMs don't become commodified.

by enoint4 hours ago|

For agentic services, how would you be able to tell that you’ve been product-placed?

by layer83 hours ago|

Hidden advertising is illegal in most jurisdictions, so it has to be indicated to the user for each specific occurrence and hence be trackable anyway.

by gowld1 hours ago|

"AI can make mistakes. Responses include sponsored content or weights."

Now it's compliant with the law.

by herval5 hours ago|

this is sorta like saying that being able to run your blog on your laptop will completely implode the cloud business

by cduzz4 hours ago|

This is actually what happens.

I run my word processing software on my Apple II (a total joke of a computer) instead of running it on the Wang.

I run my bookkeeping software on VisiCalc instead of the IBM.

I run my simulation software on my IBM PC (I even paid for the 8087!) instead of the VAX.

Moore's law has, at least so far, allowed the pioneers with toy computers to grow their toys big enough to solve "big boy" problems: given time, the toy computers get faster, and the pioneers scale their crappy home-grown solution to cover the 60% of the problem that was originally solved by some enormous, complex system.

Eventually the toy infrastructure gets expensive and solves 90-120% of the "big iron" problem space, but it also grows to cost as much as the big iron solution, but then a new generation of toy software and toy systems emerges to disrupt the "big iron" systems.

by manoDev1 hours ago|

You're right that Moore's law has been holding up, but it will hit a hard limit on process node size, so all scaling will be based on multiple cores. OTOH, computing per watt spent has been plateauing. If the future bottlenecks are energy and cooling, that will require infrastructure-scale solutions. My bet is this is going to be the real AI company moat.

https://www.riq.net.br/pub/computing-scaling/

by ethbr14 hours ago|

Underappreciated requirement for this to work in post-cloud times: open source.

If a vendor can SaaS a solution, then enterprise is generally happy (they don't want to have to hire folks for maintenance), and that completely locks out any ability to run locally.

Between enterprise's ambivalence and the obvious financial incentive to vendors, you get SaaS-only products.

by observationist4 hours ago|

It's a huge difference. If you had AI sufficiently good running locally on a phone, you could devise workflows for things like basic digital hygiene, technical assistance, and tedious tasks like inbox management, image sorting, device updates, and so on. Privacy and security get a big boost past some local competence threshold, and we're nearly there.

Make the local AI competent enough to do good image generation and editing, realtime voice and music generation, handle agentic tasks with a framework like Hermes, and you can take your AI places to do tasks in contexts that are inaccessible to or inappropriate for cloud.

Frontier big platform models will be the best, but there's a level of "good enough" for local uses that we're already seeing flourish, and "good enough" for the average joe is almost here.

by zozbot2344 hours ago|

Phones and laptops are terrible devices for local AI, way too constrained by bad thermals and small batteries. Mini PCs (many of them using mobile hardware) don't have that particular issue, and can easily run on a 24/7 basis.

by 1 hours ago|

deleted

by trollbridge4 hours ago|

Phones are also a terrible place to run a radio, but there's a huge amount of benefit in figuring out how to do so.

by observationist1 hours ago|

That level of local AI is also more or less what you need for competent autonomous robots, too. If your household robots are orchestrated from your phone, the local security and cloud convenience converge on a single device. No extra servers, etc, reduced cost, all that - local AI is a massive market amplifier.

by grumpymuppet5 hours ago|

It's a little different because cloud and blogs didn't actively get in the way of your home compute. To wit, the various cost spikes for hardware.

People -- WANT -- this technology on their home devices and (apparently?) the providers of this tech don't seem to be running a profit so they probably don't want the maintenance tail on their side either.

I think it's a bit different. Inevitable that this becomes a household-run thing? Not likely.

by malmz5 hours ago|

Running an LLM locally is theoretically viable. Running your blog on your laptop is never viable (unless you hook it up like a server). One just requires compute, while the other requires a stable network.

by Scoundreller4 hours ago|

tbh, my home network is pretty close to the stability of my host these days…

But my downtimes are a bit self-inflicted: changing ISPs, which I can personally work around, but that's harder for a blog where one expects uptime.

by asdfsa324 hours ago|

The primary feature of a blog or any website is that it is available around the clock. That is the primary feature of cloud: around-the-clock compute and networking that scale on demand.

The primary feature of "AI" is to process information and reason with a natural language interface at speed, the primary feature of AI bigboys is to provide the machinery that runs the "models".

See the difference?

by gowld1 hours ago|

You severely underestimate how small a fraction of a frontier AI's performance and human labor is in "the model".

Hosting a blog 24x7 on a laptop is trivial, except for hyperscaling to the front page of HN and Reddit.

by Kinrany5 hours ago|

More like implode proprietary blog hosting platforms and replace them with commodity VMs that can be used for blog hosting, among other things

by asimovDev5 hours ago|

Wouldn't arcade cabinets vs home video game consoles be a more apt comparison?

by emsign2 hours ago|

You have to consider that the enshittification factor is much higher now than in the cloud-for-free age.

by sreekanth8505 hours ago|

Curious when the NVIDIA monopoly will end. China will surely release something that can run on commodity hardware. I hope they do soon.

by IdiotSavage5 hours ago|

I find that hard to believe. The AI companies will want to control what's possible and find new things to do that "need" their services. Otherwise it would be like Intel and Microsoft had decided in the year 2000 that computers are "good enough" now and we would have explored what's possible with that hardware ever since.

by squidbeak5 hours ago|

> Otherwise it would be like Intel and Microsoft had decided in the year 2000 that computers are "good enough" now and we would have explored what's possible with that hardware ever since.

I think you've misunderstood what good enough means in the context - which is a model capable of completing the tasks assigned to it without having the breadth of full generalization. Your analogy breaks down because of this - we did get 'good enough' spec profiles for different hardware. That thing you're wearing on your wrist won't have the same specifications as the box you use to play games.

by IdiotSavage5 hours ago|

I think you've misunderstood the analogy. Just ignore it, analogies mostly break down anyways.

> a model capable of completing the tasks assigned to it

The thing is, the "task assigned to it" is changing with improved capabilities. If everyone around you in 2036 is using general AI to do amazing stuff, you will probably have little interest in vibe coding slop like it's 2026.

by coldtea4 hours ago|

>The thing is, the "task assigned to it" is changing with improved capabilities.

Only if you give in to fads and FOMO.

The core tasks people need change at a much smaller pace.

by brookst4 hours ago|

Analogies are like metaphors, they’re illustrative rather than literal.

by benterix5 hours ago|

> The AI companies will want to control what's possible and find new things to do that "need" their services.

That's correct. The problem is they have smart people, tons of money, and several years to figure that out, and the best thing they can come up with is a coding agent.

by lazide1 hours ago|

That isn’t the best thing they’ve come up with. It’s a marquee product that is fit for public consumption, however.

The ‘best’ things are:

- fuzzy pattern matching algorithms for traffic analysis, human and other image target recognition.

- targeting algorithms that identify ‘suspicious’ individuals in large volumes of metadata.

- fraud analysis

- antagonistic image and video generation, both for fooling other fraud analysis, but also for propaganda, screwing with other actors, etc.

- directed high speed content generation (text, pictures, video) to spam the ‘algorithm’ and allow near realtime identification of additional buttons to push for given target audiences.

- massive marketing/ad manipulation.

Those budget line items (and the suppliers) really want to stay off the radar however, as it makes their life harder.

by coldtea4 hours ago|

>Otherwise it would be like Intel and Microsoft had decided in the year 2000 that computers are "good enough" now and we would have explored what's possible with that hardware ever since.

That would be the dream... no fucking Electron! No lockdown modules.

by dboreham5 hours ago|

Not saying this isn't the case, but my Anthropic subscription costs me less than the electricity would to power such a home inference system.
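
Whether that holds depends heavily on power draw, hours of use, and the local tariff. A minimal sketch for checking it with your own numbers; every value in the example is a hypothetical placeholder, not a measurement of any real system or plan.

```python
# Back-of-the-envelope electricity cost for a home inference box.
# Every number passed in below is a hypothetical placeholder; plug in your
# own power draw, duty cycle, tariff, and subscription price.


def monthly_electricity_cost(avg_watts: float, hours_per_day: float,
                             price_per_kwh: float, days: int = 30) -> float:
    """Energy cost for one month: kW x hours x price per kWh."""
    kwh = (avg_watts / 1000.0) * hours_per_day * days
    return kwh * price_per_kwh


def cheaper_than_subscription(avg_watts: float, hours_per_day: float,
                              price_per_kwh: float, subscription: float) -> bool:
    """True if a month of electricity costs less than the subscription fee."""
    cost = monthly_electricity_cost(avg_watts, hours_per_day, price_per_kwh)
    return cost < subscription


if __name__ == "__main__":
    # Hypothetical: 400 W average draw, 8 hours a day, 0.25 per kWh,
    # compared against a hypothetical 20-per-month subscription.
    print(monthly_electricity_cost(400, 8, 0.25))
    print(cheaper_than_subscription(400, 8, 0.25, 20.0))
```

The comparison covers energy only; it leaves out the hardware purchase and any difference in model quality.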

by techpression4 hours ago|

Gamers Nexus has a good video on this. If NVIDIA exits the consumer market (and honestly, why would they stay when they can charge up to 100x for the same wafer space for enterprise?), AMD would likely do the same. Then only Apple really makes consumer hardware suitable for running things locally, and maybe some weird Qualcomm ARM chip for Windows. It will be hard to run things locally if nobody is supplying the hardware.