What stops you from running the best open-weight LLMs currently available on consumer-grade hardware for the rest of time? They're good enough for 95% of use cases, and they don't have a use-by date. From what I can see, the "danger" is not having the next tier that comes out, but the impact of that is very low.
For quite a lot of use cases, the current systems arguably do get worse over time if not continually updated. The knowledge cutoff date will start to hurt more and more as the weights age in a hypothetical scenario where you are stuck with them forever.
Coding, one of the most popular use cases today, would not be great if the model only understood, say, Java up to a version from years ago.
Or will human-readable code be less and less of a thing as AI learns its own, more terse language to talk to other AIs?
It feels very obvious that the solution is to have a smaller model that can be trained exclusively on Java information to augment the older model. If the architecture doesn't support it currently, then that's what the architecture will look like in the future.
Otherwise you'd be arguing that, to serve users who want an up-to-date LLM on topic X, you have to train the model on the entire ABC all over again.
It's simply ludicrous to have a coding LLM that needs to be retrained on the latest published poems and pastry recipes to generate Java.
This LLM trained only and entirely on pre-1930s texts was able to code Python programs when given only a short example:
Pockets are too deep, it will only change once everyone is out of money.
Side note though, it’s the speed that bothers me more than the reasoning. Qwen 3.5 is awesome, but my Claude subscription can tear through similar workloads an order of magnitude faster than my local LLM can when using Haiku. That’ll matter a lot to some people.
They're not at all, not even close. Especially when you consider the use cases for people who are paying for LLM services today.
Uh… the hardware requirements? And stop acting like some dog shit 8B model the average Joe can run on a laptop is even close to being comparable to what Claude or even Codex can currently do.
I have pretty good hardware and I’ve tinkered with the best sub-150B models you can use and they are awful compared to Anthropic/OAI/Grok.
honest question, i'm very interested in this, but too casual as of now to know any better.
I'm not, you've actually illustrated my point. LLMs in 2022 were very impressive. By 2024 the general public was finding them an acceptable replacement for many research-driven tasks and massive shortcuts for other tasks (coding, image work, document preparation, etc).
Those models are absolutely runnable on consumer hardware now, and we were extremely happy with the results. It's no different to how we used to think CRTs were amazing or early smartphones, but going back now they seem awful.
We're long past "danger". If what we have is the best we'll ever have open source, we're already in an excellent position.
No they weren't. They were a gimmick - it is only in the past 6 or so months that frontier models have started to do stuff beyond mere gimmicks when it comes to coding, and you could make the argument that Mythos has been the first 'Holy shit' moment that we've had that has stepped us beyond 'Yeah that's really neat but...'
> Those models are absolutely runnable on consumer hardware now,
A sub 50B model is awful and can't even write proper English sentences half the time, to say nothing of how bad its world knowledge is. Try the 32B Gemma 4 local model for a week and then go back to Claude and then get back to me.
> We're long past "danger". If what we have is the best we'll ever have open source, we're already in an excellent position.
Not sure what to tell you other than that you and I have very different standards. What we have locally right now is barely more than a glorified autocomplete, and it feels worse than using ChatGPT 2 years ago because the context window is smaller and it doesn't have good webhooks on consumer setups. Another thing I'd say is that you clearly have no clue what 'consumer hardware' means, or what consumers that can even get this stuff running locally would have to do to get it to even rival the frontier models in terms of their usability flow (most consumers aren't going to just boot into Ubuntu and run this thing from a command line), to say nothing of the hardware requirements. I'd love to never use Claude or Gemini or ChatGPT again for both privacy and money reasons, but the quality of outputs, depth of thinking, and writing ability of even the very best local models you can run right now is many orders of magnitude less than what you get using distributed frontier models, and those 'very best' local models require a top of the line machine that 99.9999% of consumers don't have and would never consider buying. The cloud models all have like a trillion(!) parameters now. It isn't even close.
I sure hope the local side of things massively improves over the next 2-3 years, but based on how this has gone my guess is that in 3 years you'll be lucky, if you have very top of the line hardware, to get benchmark performance that we had 6 months ago with the frontier models. The distributed hardware/memory gap is just too big.
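For a rough sense of the memory gap being described, here is a back-of-envelope sketch. It only counts the bytes needed to hold the weights (parameters times bits per parameter, divided by 8) and ignores KV cache, activations, and runtime overhead, so real requirements are higher. The model sizes are just the ones thrown around in this thread (8B, 32B, 150B, and the "like a trillion" figure, which is the commenter's claim, not a verified number), and the function name is made up for illustration.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory needed just to hold the weights, in GB (10^9 bytes).

    Assumption: weights only. KV cache, activations and framework overhead
    are not included, so treat the result as a lower bound.
    """
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # Sizes mentioned in the thread; 1000B stands in for "a trillion parameters".
    for params in (8, 32, 150, 1000):
        fp16 = weight_memory_gb(params, 16)  # unquantized 16-bit weights
        q4 = weight_memory_gb(params, 4)     # aggressive 4-bit quantization
        print(f"{params:>5}B params: ~{fp16:,.0f} GB at 16-bit, ~{q4:,.0f} GB at 4-bit")
```

Even at 4 bits per parameter, a trillion-parameter model would need on the order of 500 GB for the weights alone, while a 32B model fits in roughly 16 GB.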
Note that we are talking about 95% of everyone's use cases, not your specific use cases (which could require better models all the time).
The huge difference to open source is that you can't just train an LLM with free time and motivation. You need lots of data and a lot of compute.
I sure want to be wrong on that, I definitely like the open-weight version of the future more
In the same way you can imagine the Chinese government pushing the release of deepseek etc to make sure no one thinks the US has “won” and to keep everyone aware that a foreign model might leapfrog in the short term future etc.
At some point though if OpenAI/Anthropic/Google plateau or go bust then the open source sponsorship becomes less likely, as making it open source was a weapon not a principle.
Effectively they are saying "yea don't crowd our data centers with small queries, go ahead and send your frontier questions to our frontier models. Oh btw those US models? You can run something about as good for free from us if you want hah." It's a power and marketing move. It's also insanely smart to keep up with it to remain sustainable as a brand. Especially given how small their investments into this are.
Look at Anthropic's growing pains. Deepseek has other hosts spreading their brand for free while they grow. Brilliant honestly. In my opinion it makes Anthropic and OpenAI look clueless on a lot of levels.
China is playing a different game here. To them this is commoditizing their complement and building good will. The Chinese economy doesn't teeter on the brink of collapse to deliver frontier grade LLMs. Nope, Alibaba just made Qwen because it needs it. It needs efficient models. Similarly, in China they manufacture and automate so much more than the US ever could. LLMs to them are a topping, not the whole meal like they are in the US.
They're state companies, not some kind of ethical VC charity fund project.
If the US’s fascist experiment continues past the current president, we’ll absolutely be nationalizing frontier companies or exerting equivalent control.
I'm glad I get reminded that TDS is real, but everyone forgets that Bush, Obama, and Biden all did things with executive power that Congress ignored or provided little real oversight for. And Congress has proven over the last several decades that their oversight is rather meaningless for the goals of American voters rather than special interests.
But it's all Trump's fault is much more convenient.
Absolutely not. There is a huge difference in their behaviors.
> But it's all Trump's fault is much more convenient.
It is not just Trump's fault. Trump is the logical consequence of what the conservative party became. J.D. Vance and Miller are as much fascists, if not more. The whole party worked for this for years and created this.
> And Congress has proven over the last several decades that their oversight is rather meaningless for the goals of American voters rather than special interests.
Of course Congress in general is not the place to stop the Republican party from its fascist goals, because Republicans in Congress support Trump 100%. They stand by Project 2025 100%. They are doing oversight all right when it comes to blocking Democrats.
The idea that the party that made Trump big, promoted the ideas he built on, and created Project 2025 is supposed to be a counterbalance to itself is absurd.
https://try.works/#why-chinese-ai-labs-went-open-and-will-re...
It did work for Deepseek for sure and it seems to move the needle for Xiaomi's MiMo; but will it be enough for Qwen and Gemma? Those are the models you can actually run without going all-in on AI (but only with gaming GPUs and such).
The compute required to run these models is still very far out of reach for the average consumer, and even for the known enthusiast, therefore they still sell inference, whilst also getting consumer goodwill for providing open weights.
China? I'm getting ready to watch the URKL (universal robot knockout league) go on. The USA is dicking around with failed robot dogs.
The USA has been a failed country, coasting on massive inertia. But the tech avenues from an article I can't find showed the USA excelling in 8/64 areas. China was excelling in 56/64 areas.
Smart people in China design fast manufacturing lines for $25k/yr.
Smart people in the US design bond hedging strategies or ad-pixel trackers for $250k/yr.
China is in the stage the US was in 60 years ago, and eventually those high paying, high impact jobs will suck the intelligence out of all the "blue collar" work. Just like it did in the US.
Dodging politics, the power structures in us industry need serious revamping.
The USA exports, and has long exported, services, especially in IT. And a lot. "The USA has nothing to export" is true only if you intentionally ignore the stuff the USA exports.
So, the business model of open models is the same as closed models: Sell inference. Open source is marketing for that inference.
https://try.works/#why-chinese-ai-labs-went-open-and-will-re...
The Open Source AI Definition (OSAID) is quite ridiculous, I prefer the Debian ML policy for defining freedoms around AI.
Frontier US labs could still have an advantage for a long time, but many use cases would start gravitating towards Chinese models if they 10x the data centers and provide similar quality inference for a third of the cost.
Not everything good in our society needs to have a "business model". People still work on it. It's FINE.
Donations. Have you donated lately?
Wikipedia is cheap compared to creating and training models.
I don’t think donations will suffice at all.
As an example, we had millions of web developers download and install Firebug before browsers shipped their own dev tools. Donations over the course of multiple years would have paid my salary for a month if I were not a volunteer.
But from the “it’s fine” point of view, models will be baked into your OS.
Then later models will be embedded into hardware. Likely only OS makers models.
DeepSeek said it spent $5.6M [1] on training V3, which doesn't sound too much for a near-SOTA model.
An open source entity can come up with a hybrid business model, such as requiring a small fee from those who want to host the model as a business for the first n months following the release of a new model, but making it fully free for individuals.
This is what I do not understand either; advertising their know-how and their more advanced models is the only thing that comes to my mind.
For a month now I have been successfully using gemma4 locally on an MBP M2 for many search queries (Wikipedia-style questions) and it is really good, fast enough (30-40t/s) and feels nice as it keeps these queries private. But I don't understand why Google does this and so I think "we" need to find a better solution where the entire pipeline is open and the compute somehow crowdfunded. Because there will be a time when these local models will get more closed, like Android is closing down. One restriction they might enforce in the future could be that they cripple the models for "sensitive" topics like cybersecurity or health topics. Or the government could even feel the need to force them to do so.
It builds good will also. It also shows research prowess.
For China it's different. They need to show Americans who don't trust them at all because of propaganda that they have no tricks up their sleeve. It also doesn't hurt when Chinese companies drop models for free that people can run at home that are about as good as Sonnet. Serious mic drop.
Running AI models on local hardware was exploratory at first, and if it's so easy today it's thanks to open source. It's a little bit coincidental that we have this today, and that mainstream hardware has this capability. The fact that a phone can run very small models is exploratory or some kind of marketing opportunity at best.
Why would hardware companies ship cards with more AI capabilities (like more VRAM) in the foreseeable future? On what grounds will the marketing for on-device AI keep generating interest? For something as important, it's very uncertain. But above all, it should not depend on these brittle justifications.
Showing good will in distribution and research prowess today is positive communication, but it can be exactly the opposite if/when an attack using those small models reaches a high-value target.
For China the cultural difference is so huge, it's difficult to say. I would think they first and foremost need to show everyone inside and outside of China that they match American models. Second, I would say that where Americans prefer a few very powerful companies from the get-go because they can leverage a lot of capital rapidly to industrialize, China will prefer leveraging a lot of smaller companies exploring a lot of things simultaneously (so doing a lot of research), THEN creating legislation to let only the best (or a few) survive. In the end it's the same result (monopoly or oligopoly), but China may have a stronger core (research) and America may have stronger productive capital, which may prove obsolete... In the long run, on either side it's a gamble, again.
I disagree on the second point. I think most Americans don't prefer less competition, that's a bit antithetical to the free market.
I doubt the Chinese government cares as much about controlling a few companies as you think they do.
China has a few things going for it beyond research. They are mission driven, they actually have needs for this technology, their needs will forward their entire economy as they are the world's largest manufacturers. They are also huge exporters and have buckets of customer support for various languages.
China also has considerably stronger infrastructure for electricity, etc. Even with an Nvidia embargo they are doing more than showing up.
I don't think it's a matter of who "wins". There is no winning. I think China stands to gain far more from LLMs than the US does, and they have proven they don't need the US to do it, even with the US trying to sabotage its every move into the space. The game is already more or less over in my mind.
If anything I see LLMs as having a huge market in China, and now the US can't even sell it to them.
All I care about is, if I have to use this technology, let me run it locally to avoid the surveillance capitalism aspect. That seems to be the real reason the US has propped up its economy in anticipation of this technology. Yet it doesn't benefit the US or me in the long term.
I don't think local will necessarily be open-weight. And then it's not that different from personal computing: you're giving up the big lucrative corporate mainframe, thin-client model for "sell copies to a ton of individuals."
So it'd be someone else (an Apple, or the next-year equivalent of 1976 Apple) who'd start eating into that. There are a few on-device things today, but not for much heavy lifting. At first it's a toy, could maybe become more realized in a still-toy-like basis like a fully-local Alexa; in the future it grows until it eats 80-90% of the OpenAI/Anthropic use cases.
Incumbents would always rather you pay a subscription or per-use forever, but if the market looks big enough, someone will try to disrupt it.
The cost to transmit text is basically free and instantaneous. The rent (i.e. a GPU in a data center) vs buy decision is going to favor rent until buy is a trivial expense. Like the $50-100 range.
Even then a LLM that just works is easier than dealing with your own
Video game streaming is the closest thing, and it's never really taken off. (And this, IMO, is a good comparison because it's a pretty similar magnitude up-front-cost, $500-$4000.)
Once the local-AI-is-good-enough (Sonnet level for a lot of basic tasks, say) for a $1k up-front investment the appeal of having something that can chew on various tasks 24/7 w/o rate limits, API token budget charge concerns, etc, is going to unlock a lot of new approaches to problems. Essentially more fully-baked line-of-business OpenClaw-type things. Or the smart home automation bot of Siri's dreams. You can more easily make that all private and secure when all the compute is local: don't give any outside network access. Push data into the sandbox periodically via boring old scripts-on-cronjobs, vs giving any sort of "agentic" harness external access. Have extremely limited data structures for getting output/instructions back out. I'd never want to pass info about my personal finances into a third party remote model; but I'd let a local one crunch numbers on it.
Even if you need Opus/Mythos/whatever level for certain tasks, if 95% of everything else you'd pay Anthropic or OpenAI for can now be done on things you own w/o third party risk... what does that do to the investment appeal of building better AI appliances to sell end users vs building better centralized models?
I think "what if today's LLM performance, but running entirely under your control and your own hardware" opens up a LOT of interesting functionality. Crowdsource the whole world's creativity to figure out what to do with it, vs waiting for product managers and engineers at 3 individual companies to release features.
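To make the "extremely limited data structures for getting output back out" idea concrete, here is a minimal sketch of what the outbound side of such a sandbox could look like. Everything in it is an assumption for illustration: the field names, the allowed actions, and the size limit are invented, and the local model is simply assumed to emit one JSON object as text. The point is only that whatever leaves the box is parsed against a tiny fixed schema and rejected otherwise.

```python
import json
from dataclasses import dataclass

# Hypothetical allowlist and limits; adjust to whatever the outside world needs.
ALLOWED_ACTIONS = {"report", "alert", "noop"}
MAX_NOTE_CHARS = 200


@dataclass(frozen=True)
class ModelOutput:
    """The only shape of data permitted to leave the sandbox."""
    action: str
    amount_cents: int
    note: str


def parse_model_output(raw: str) -> ModelOutput:
    """Validate raw model text against the fixed schema, or raise ValueError."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict) or set(data) != {"action", "amount_cents", "note"}:
        raise ValueError("output must be an object with exactly the expected fields")
    action, amount, note = data["action"], data["amount_cents"], data["note"]
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ValueError("action not in allowlist")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(amount, int) or isinstance(amount, bool):
        raise ValueError("amount_cents must be an integer")
    if not isinstance(note, str) or len(note) > MAX_NOTE_CHARS:
        raise ValueError("note must be a short string")
    return ModelOutput(action=action, amount_cents=amount, note=note)


if __name__ == "__main__":
    ok = parse_model_output('{"action": "report", "amount_cents": 12345, "note": "monthly total"}')
    print(ok)
    try:
        parse_model_output('{"action": "send_email", "amount_cents": 1, "note": "x"}')
    except ValueError as err:
        print("rejected:", err)
```

The inbound side would be the boring cron script pushing files into the sandbox; nothing here gives the model a network connection or a free-form channel out.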
Anyways, who's spending $1k for an LLM machine when they can spend $20 (or 0) on a subscription? And who's having an LLM crunching away 24/7 anyways? Anyone who is going to do something like that probably wants a cutting edge model.
It'll (probably) get to a point where the hardware is cheap enough and advancement levels off. But we're a ways from that and even then when a data center is 20ms away why not offload heavy compute that's mostly text in text out.
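The rent-vs-buy numbers in this subthread ($1k up front vs a $20/month subscription) can be put into a quick break-even sketch. The electricity figure is a made-up placeholder, not a measurement, and the function ignores resale value, hardware aging, rate limits, and any difference in model quality.

```python
def breakeven_months(hardware_cost: float,
                     subscription_per_month: float,
                     power_cost_per_month: float = 0.0) -> float:
    """Months until buying hardware costs less than paying a subscription.

    Assumption: the only running cost of local hardware is electricity.
    Returns infinity if local running costs eat the whole subscription saving.
    """
    saving_per_month = subscription_per_month - power_cost_per_month
    if saving_per_month <= 0:
        return float("inf")
    return hardware_cost / saving_per_month


if __name__ == "__main__":
    # Figures from the comments above: $1k machine vs $20/month.
    print(breakeven_months(1000, 20))      # no running costs counted
    # Hypothetical $5/month of electricity for the local box.
    print(breakeven_months(1000, 20, 5))
```

With no running costs the $1k box pays for itself in 50 months; any electricity cost pushes that out further, and against a free tier it never breaks even on price alone.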
How many crowdfunded projects do you know that have raised even one percent of that? Who’s going to be in charge of collecting that scale of money? Perhaps some sort of company formed for the benefit of humanity, which will promise to be a non-profit? Some sort of “Open” AI?
Oh, wait.
I can't say that you are lying and you are not exactly exaggerating either. It is true that a new SOTA model -- from literal scratch -- would be expensive.
But, and it is not a small but, is the starting point really zero?
Much like the current Twitter model, being able to put your thumb on the scale of "truth". Bake a stronger bias towards their preferred narrative directly into the model. Could be as "benign" as training it to prefer Azure over AWS. Could be much worse.
Sometimes there are things where the public good is best served with public expenditure.
Not every country is in a crypto-libertarian race to hoard power and wealth.
Meanwhile, in the EU, the model would be collectively financed, trained by a competent, neutral agency... and then completely lobotomized in the name of "the children," "safety," "IP rights," "correct speech," dozens of individual countries' legal and regulatory requirements, and any number of additional vocal, noncontributing NGOs.
So no one would get rich off of the public model, but no one would get much of anything else out of it, either.
As another reply suggests, there's a reason why things happen in the USA first. Even when they don't, the prime movers move here as soon as they can. Or at least they used to.
1. Innovate, create, and offer it all at sweetheart prices to the public while you rack up debt.
2. Shovel in more money and either buy out or outlast the competition. Become dominant. Lock in your users any which way you can.
3. Enshittify and cash in.
The deals Anthropic, OpenAI, etc. offer won't stay this good much longer. Don't let them lock you in. Failing that, you should budget more for the same service. You're going to need it. Having an open alternative running on your own hardware offers non-negligible peace of mind.
Read through a 1970s-era issue of Popular Electronics or Byte, and then spend some time surfing /r/LocalLlama. You'll get a sense of real-time deja vu, like you're watching history unfold again.
How serious a risk is poisoned weights?
Can we leverage the cryptobros into using LLM training as a proof of work?
Having an LLM use a web search tool isn't the same thing as researching a topic, IMO, because it's so ephemeral and needs constant reinforcement. LLMs aren't learning machines, they're static ones.