by TheJCDenton 15 hours ago

by apublicfrog 14 hours ago

> It's a very dangerous gamble. Today incredible value is available for nearly everyone. But it may stop without any warning, for reason outside our control.

What stops you from running the best open weighted LLMs currently available on consumer grade hardware for the rest of time? They're good enough for 95% of use cases, and they don't have a used by date. From what I can see, the "danger" is not having the next tier that comes out, but the impact of that is very low.

by giobox 14 hours ago

> they don't have a used by date

For quite a lot of use cases, the current systems arguably do get worse over time if not continually updated. The knowledge cutoff date will start to hurt more and more as the weights age in a hypothetical scenario where you are stuck with them forever.

Coding, one of the most popular use cases today, would not be great if it, say, only understood Java up to a version from years ago.

https://en.wikipedia.org/wiki/Knowledge_cutoff

by throwyawayyyy 13 hours ago

One solution, of course, is not to advance anything. I'm not even joking: is there going to be a successor to React? I suspect not. With the vast amount of training data for React now, it's going to look silly to move to something else with less support. What is the last new popular programming language, Rust? Will there be another one? I suspect not, by the same reasoning. The irony of all this AI acceleration talk is that it'll work best if we don't accelerate the underlying tech at all.

by WarmWash 11 hours ago

There probably won't be new stuff so much as trends in how stuff is done, and updates around optimizing those trends.

by jvm___ 11 hours ago

Will programming languages evolve into less human-oriented written code and more just calls to a trusted AI?

Or will human-readable code become less and less of a thing as AI learns its own, terser language for talking to other AIs?

by digitaltrees 10 hours ago

Yes. I am seeing a big push to use vanilla JS for single-file HTML apps that are easy to build, deploy and distribute because they have no build step. I could see component libraries emerging that make it easier to build from chat interfaces with less ceremony.

by byzantinegene 9 hours ago

i'm not sure the tradeoff in code readability is worth it as of now.

by hadlock 11 hours ago

Name/post content combo on point

by Spooky23 11 hours ago

A lot of the language work is scratching the itch of engineers and developers. I think you’re correct and React is the new COBOL.

by apsurd 11 hours ago

Humans are notoriously bad at predicting the future. Toward that end, your prediction is laughable. React is the end all be all of UI… lol

by melagonster 11 hours ago

Programmers won't be allowed to exist in the future. Vibe coding is the final resolution people can apply.

by rrvsh 13 hours ago

Nobody is unaware of the knowledge cutoff, and sharing the Wikipedia article is not helping anyone. Your point is easily rebutted by taking whatever open weights/source model has an outdated cutoff and training or fine tuning it on more data, which is again always going to be viable given a modicum of compute

by AlienRobot 42 minutes ago

I genuinely don't understand how this can possibly be a problem long term.

It feels very obvious that the solution is to have a smaller model that can be trained exclusively on Java information to augment the older model. If the architecture doesn't support it currently, then that's what the architecture will look like in the future.

Otherwise you'd be arguing that, to serve users who want an up-to-date LLM on topic X, you have to train the model on the entire ABC all over again.

It's simply ludicrous to have a coding LLM that needs to be retrained on the latest published poems and pastry recipes to generate Java.

by tcp_handshaker 13 hours ago

You could learn how to code...a whole generation did it before...

by mrtesthah 12 hours ago

> Coding, one of the most popular use cases today, would not be great if it, say, only understood Java up to a version from years ago.

This LLM trained only and entirely on pre-1930s texts was able to code Python programs when given only a short example:

https://talkie-lm.com/introducing-talkie

by nullc 12 hours ago

Small models are more useful for "doing stuff" than "knowing stuff" to begin with. Add in an agentic harness and a small model can happily read more current information on demand (including from e.g. a local wikipedia snapshot).
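
A minimal sketch of that idea, under loud assumptions: the "snapshot" here is just a hypothetical dict of article titles to placeholder text, and `retrieve` and `build_prompt` are made-up helper names, not any real harness or model API. It only shows the shape of it: look up relevant local articles by keyword overlap and paste them into the prompt, so the stale weights matter less.

```python
import re

# Hypothetical stand-in for a local Wikipedia snapshot: title -> article text.
# The article bodies are placeholders, not real content.
SNAPSHOT = {
    "Linux kernel": "placeholder article text mentioning kernel releases and drivers",
    "Java": "placeholder article text mentioning java language versions",
    "Sourdough": "placeholder article text mentioning bread and fermentation",
}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, snapshot, k=2):
    """Return up to k article titles, ranked by word overlap with the query."""
    query_words = tokenize(query)
    scored = [
        (len(query_words & tokenize(title + " " + body)), title)
        for title, body in snapshot.items()
    ]
    # Drop articles with no overlap; sort by score (descending), then title.
    hits = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [title for _, title in hits[:k]]


def build_prompt(query, snapshot, k=2):
    """Prepend the retrieved articles to the question as plain-text context."""
    titles = retrieve(query, snapshot, k)
    context = "\n".join(f"[{title}] {snapshot[title]}" for title in titles)
    return f"Context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    print(build_prompt("which kernel drivers changed", SNAPSHOT))
```

A real setup would swap the keyword overlap for a proper search index, but the division of labor is the same: the model does the "doing", the snapshot does the "knowing".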

by moffkalast 53 minutes ago

Ha yes I used to think this was not a notable issue, but just today I was getting qwen 3.5 to fix my network drivers and it immediately freaked out like: "kernel 6.17, what the fuck? that doesn't exist yet!". It almost had a mental breakdown over that detail and derailed the conversation towards checking what's wrong with the kernel version reporting lol.

by turtlebits 13 hours ago

FOMO. A new model comes out weekly and the HN crowd debates the minutiae of the changes.

Pockets are too deep, it will only change once everyone is out of money.

by 3eb7988a1663 8 hours ago

What is really amusing to me is how N months ago, the latest SOTA was incredible, but now utterly unusable. Feels like there is a model reality-distortion field in play where people can only acknowledge the flaws in retrospect.

by lxgr 13 hours ago

They’re really not good enough, unless you consider 64 GB of memory or more consumer grade.

by steve_adams_86 13 hours ago

I’m pretty happy with what a 32GB Mac Studio can do for a lot of tasks. They’re the things I’d throw a model like Haiku at, but still genuinely useful. We don’t have an answer to frontier models in the consumer range yet, but we’re not totally trapped.

Side note though, it’s the speed that bothers me more than the reasoning. Qwen 3.5 is awesome, but my Claude subscription can tear through similar workloads an order of magnitude faster than my local LLM can when using Haiku. That’ll matter a lot to some people.

by datadrivenangel 12 hours ago

Yeah this is the real killer. slower and more expensive is tough.

by root_axis 9 hours ago

> They're good enough for 95% of use cases

They're not at all, not even close. Especially when you consider the use cases for people who are paying for LLM services today.

by nightski 13 hours ago

Hardware. Frontier labs are driving up demand so much that it's priced significantly above cost making it far less affordable. Just look at Nvidia's profit margins.

by suika 14 hours ago

The use cases in the future will be nothing like the use cases from today.

by apublicfrog 7 hours ago

Maybe. The use cases people primarily use LLMs for (documents, coding, design, research) existed decades ago with different tooling. Who knows if the future will have a slew of new problems that require new models or will continue to be similar?

by avazhi 11 hours ago

> What stops you from running the best open weighted LLMs currently available on consumer grade hardware for the rest of time?

Uh… the hardware requirements? And stop acting like some dog shit 8B model the average Joe can run on a laptop is even close to being comparable to what Claude or even Codex can currently do.

I have pretty good hardware and I’ve tinkered with the best sub-150B models you can use and they are awful compared to Anthropic/OAI/Grok.

by apsurd 11 hours ago

What if the harness and loops get sufficiently better though? CC is using Haiku for codebase grepping and such. You don't see a local commodity model being "good enough" for the 80% case when matched with better harnesses and tool calls?

honest question, i'm very interested in this, but too casual as of now to know any better.

by byzantinegene 8 hours ago

vast majority of average users don't use llms for coding, and for those purposes, local llms with low param count are a far cry from SOTA models.

by apublicfrog 7 hours ago

> And stop acting like some dog shit 8B model the average Joe can run on a laptop is even close to being comparable to what Claude or even Codex can currently do.

I'm not; you've actually illustrated my point. LLMs in 2022 were very impressive. By 2024 the general public was finding them an acceptable replacement for many research-driven tasks and massive shortcuts for other tasks (coding, image work, document preparation, etc).

Those models are absolutely runnable on consumer hardware now, and we were extremely happy with the results. It's no different to how we used to think CRTs were amazing or early smartphones, but going back now they seem awful.

We're long past "danger". If what we have is the best we'll ever have open source, we're already in an excellent position.

by avazhi 4 hours ago

> LLMs in 2022 were very impressive.

No they weren't. They were a gimmick - it is only in the past 6 or so months that frontier models have started to do stuff beyond mere gimmicks when it comes to coding, and you could make the argument that Mythos has been the first 'Holy shit' moment that we've had that has stepped us beyond 'Yeah that's really neat but...'

> Those models are absolutely runnable on consumer hardware now,

A sub 50B model is awful and can't even write proper English sentences half the time, to say nothing of how bad its world knowledge is. Try the 32B Gemma 4 local model for a week and then go back to Claude and then get back to me.

> We're long past "danger". If what we have is the best we'll ever have open source, we're already in an excellent position.

Not sure what to tell you other than that you and I have very different standards. What we have locally right now is barely more than a glorified autocomplete, and it feels worse than using ChatGPT 2 years ago because the context window is smaller and it doesn't have good webhooks on consumer setups. Another thing I'd say is that you clearly have no clue what 'consumer hardware' means, or what consumers that can even get this stuff running locally would have to do to get it to even rival the frontier models in terms of their usability flow (most consumers aren't going to just boot into Ubuntu and run this thing from a command line), to say nothing of the hardware requirements. I'd love to never use Claude or Gemini or ChatGPT again for both privacy and money reasons, but the quality of outputs, depth of thinking and writing ability of even the very best local models you can run right now is many orders of magnitude worse than what you get using distributed frontier models, and those 'very best' local models require a top of the line machine that 99.9999% of consumers don't have and would never consider buying. The cloud models all have like a trillion(!) parameters now. It isn't even close.

I sure hope the local side of things massively improves over the next 2-3 years, but based on how this has gone my guess is that in 3 years you'll be lucky, if you have very top of the line hardware, to get benchmark performance that we had 6 months ago with the frontier models. The distributed hardware/memory gap is just too big.

by ai_fry_ur_brain 12 hours ago

95% of usecases. What are you smoking.

by selcuka 10 hours ago

There are very good open weight models (such as DeepSeek v4 Flash) that can run on consumer level hardware.

Note that we are talking about 95% of everyone's use cases, not your specific use cases (which could require better models all the time).

by oytis 15 hours ago

What is the business model of open weight AI? I don't think there is any. At best it can serve as an advertisement for the more advanced models you sell.

The huge difference to open source is that you can't just train an LLM with free time and motivation. You need lots of data and a lot of compute.

I sure want to be wrong on that, I definitely like the open-weight version of the future more

by wood_spirit 15 hours ago

Meta released Llama just when OpenAI was so hot and its valuation was going through the roof. Speculating, but Meta probably thought the model not competitive enough to keep as a secret weapon but well good enough to commercially damage OpenAI who were a sudden competitor for most-valued-company?

In the same way you can imagine the Chinese government pushing the release of deepseek etc to make sure no one thinks the US has “won” and to keep everyone aware that a foreign model might leapfrog in the short term future etc.

At some point though, if OpenAI/Anthropic/Google plateau or go bust, then the open source sponsorship becomes less likely, as making it open source was a weapon, not a principle.

by 2ndorderthought 14 hours ago

I disagree. I think deepseek, qwen, and kimi earn a lot of trust open sourcing their models. While still profiting.

Effectively they are saying "yeah, don't crowd our data centers with small queries, go ahead and send your frontier questions to our frontier models. Oh btw, those US models? You can run something about as good for free from us if you want, hah." It's a power and marketing move. It's also insanely smart to keep up with it to remain sustainable as a brand. Especially given how small their investments into this are.

Look at Anthropic's growing pains. DeepSeek has other hosts spreading their brand for free while they grow. Brilliant, honestly. In my opinion it makes Anthropic and OpenAI look clueless on a lot of levels.

China is playing a different game here. To them this is commoditizing their complement and building goodwill. The Chinese economy doesn't teeter on the brink of collapse to deliver frontier-grade LLMs. Nope, Alibaba just made Qwen because it needs it. It needs efficient models. Similarly, in China they manufacture and automate so much more than the US ever could. LLMs to them are a topping, not the whole meal like they are in the US.

by WarmWash 11 hours ago

The Chinese labs don't have to make money or be profitable. They are funded by the state to achieve the state's goals, and the global praise of their open models just serves as Chinese soft power.

They're state companies, not some kind of ethical VC charity fund project.

by 2ndorderthought 11 hours ago

The fun part is, they are making money and have way less to pay off despite 100s of billions in donations than the US companies do.

by Spooky23 11 hours ago

Is it so different?

If the US’s fascist experiment continues past the current president, we’ll absolutely be nationalizing frontier companies or exerting equivalent control.

by treis 11 hours ago

Yes, China is very different from the US.

by ThunderSizzle 10 hours ago

Sigh. Obama and Biden were as every bit "fascist" as Trump.

I'm glad I get reminded that TDS is real, but everyone forgets that Bush, Obama, and Biden all did things with executive power that Congress ignored or provided little real oversight for. And Congress has proven over the last several decades that their oversight is rather meaningless for the goals of American voters rather than special interests.

But it's all Trump's fault is much more convenient.

by watwut 4 hours ago

> Sigh. Obama and Biden were as every bit "fascist" as Trump.

Absolutely not. There is a huge difference in their behaviors.

> But it's all Trump's fault is much more convenient.

It is not just Trump's fault. Trump is the logical consequence of what the conservative party became. J.D. Vance and Miller are as much fascists, if not more. The whole party worked for this for years and created this.

> And Congress has proven over the last several decades that their oversight is rather meaningless for the goals of American voters rather than special interests.

Of course Congress in general is not the place to stop the Republican party from its fascist goals, because Republicans in Congress support Trump 100%. They stand by Project 2025 100%. They are doing oversight all right when it comes to blocking Democrats.

The idea that the party that made Trump big, promoted the ideas he built on and created Project 2025 is supposed to be a counterbalance to itself is absurd.

by platevoltage 8 hours ago

Certainly Biden and Obama check off a few of the 14 points of Fascism, but are we really being serious here? "TDS" is just a thought terminating cliche.

by try-working 13 hours ago

Correct. Open source is a PR and marketing strategy for new labs, regardless of origin.

https://try.works/#why-chinese-ai-labs-went-open-and-will-re...

by D2OQZG8l5BI1S06 9 hours ago

Interesting article, but Qwen does seem to be closing off. They don't release big variants anymore, and I'm not sure that the fact the local-LLM community keeps praising it actually increases the number of people using their API.

It did work for Deepseek for sure and it seems to move the needle for Xiaomi's MiMo; but will it be enough for Qwen and Gemma? Those are the models you can actually run without going all-in on AI (but only with gaming GPUs and such).

by try-working 9 hours ago

Definitely. Open releases will accelerate this year, including from Qwen because they're behind in adoption.

by HDBaseT 12 hours ago

You can still make money on open weight models.

The compute required to run these models is still very far out of reach for the average consumer, and even for the known enthusiast, so the labs can still sell inference while also earning consumer goodwill for providing open weights.

by datadrivenangel 12 hours ago

And the efficiency! Big accelerator cards are ~100x the throughput per watt in terms of raw processing power.

by mystraline 13 hours ago

That's because the USA has really nothing big to export. Yay, designs.

China? I'm getting ready to watch the URKL (universal robot knockout league) go on. The USA is dicking around with failed robot dogs.

The USA has been a failed country, coasting on massive inertia. But the tech avenues from an article I can't find showed the USA excelling in 8/64 areas. China was excelling in 56/64 areas.

by WarmWash 11 hours ago

China is an advanced 2nd world country with pockets of first world.

Smart people in China design fast manufacturing lines for $25k/yr.

Smart people in the US design bond hedging strategies or ad-pixel trackers for $250k/yr.

China is in the stage the US was in 60 years ago, and eventually those high paying, high impact jobs will suck the intelligence out of all the "blue collar" work. Just like it did in the US.

by 2ndorderthought 13 hours ago

I believe it. The us intentionally lacks accountability to prop up the already wealthy in almost all of its ventures. Which socializes losses and capitalizes gains. It's an economic model that guarantees deterioration and stagnation.

Dodging politics, the power structures in us industry need serious revamping.

by mrleinad 12 hours ago

China is going to be the next Germany: a loser in the new world without globalization

by watwut 4 hours ago

> That's because the USA has really nothing big to export. Yay, designs.

The USA exports, and has exported, services, especially in IT. And a lot. "The USA has nothing to export" is true only if you intentionally ignore the stuff the USA exports.

by sillysaurusx 12 hours ago

If this is true, then why are most of the companies that change the world founded in the US?

by try-working 13 hours ago

Open sourcing models is a marketing strategy. Chinese labs and small international labs have no awareness or distribution, so unless they become a hot topic for a while, nobody is going to bother trying out their models. Open source gets them that, and is essentially a tax on newcomers. When you start out you simply have no other option but to open source your models.

So, the business model of open models is the same as closed models: Sell inference. Open source is marketing for that inference.

https://try.works/#why-chinese-ai-labs-went-open-and-will-re...

by pabs3 11 hours ago

None of these models are open source; they are just public weights, with licensing that sometimes, but usually doesn't, meet the Open Source Definition.

The Open Source AI Definition (OSAID) is quite ridiculous, I prefer the Debian ML policy for defining freedoms around AI.

https://salsa.debian.org/deeplearning-team/ml-policy/

by kranke155 12 hours ago

China’s long term goal might just be to own the chip layer alongside everything else, and outproduce the US in data centers.

Frontier US labs could still have an advantage for a long time, but many use cases would start gravitating towards Chinese models if they 10x the data centers and provide similar quality inference for a third of the cost.

by js8 14 hours ago

What is the business model of Wikipedia? I don't think there is any.

Not everything good in our society needs to have a "business model". People still work on it. It's FINE.

by sroussey 14 hours ago

> What is the business model of Wikipedia?

Donations. Have you donated lately?

Wikipedia is cheap compared to creating and training models.

I don’t think donations will suffice at all.

As an example, we had millions of web developers download and install Firebug before browsers shipped their own dev tools. Donations over the course of multiple years would have paid my salary for a month if I were not a volunteer.

But from the “it’s fine” point of view, models will be baked into your OS.

Then later models will be embedded into hardware. Likely only OS makers models.

by selcuka 10 hours ago

> Wikipedia is cheap compared to creating and training models.

DeepSeek said it spent $5.6M [1] on training V3, which doesn't sound too much for a near-SOTA model.

An open source entity can come up with a hybrid business model, such as requiring a small fee from those who want to host the model as a business for the first n months following the release of a new model, but making it fully free for individuals.

[1] https://arxiv.org/pdf/2412.19437

by avidphantasm 14 hours ago

Ultimately, information is a public good: it is non-excludable (you can’t stop people from using it) and it is non-rival (we can all use it at the same time). Public goods are often very useful, and because they are non-excludable and non-rival, ultimately can’t have a market-based business model. I would class open-weights AI models as public goods, and would support government expenditure to produce them.

by phainopepla2 14 hours ago

Training AI models is capital intensive, though. Unless there's some sort of mega-crowdfunding effort for open weight model training there needs to be a way to recoup that money on the other end. Either that or state sponsorship I guess

by PAndreew 15 hours ago

Perhaps you can create a compelling UX around it and sell it as a subscription. "Normies" will not be able/willing to build it. You can then patch the model/ship new features around it as it evolves. For example I have built an ambient todo list / health data extractor using Gemma 4 2EB and Whisper. Nothing to brag about, but it does a fairly decent job, even in foreign languages.

by karussell 15 hours ago

> What is the business model of open weight AI?

This is what I do not understand as well and advertising the knowledge and more advanced model is also the only thing that comes to my mind.

For a month now I have been using gemma4 locally and successfully on an MBP M2 for many search queries (Wikipedia-style questions), and it is really good, fast enough (30-40t/s) and feels nice as it keeps these queries private. But I don't understand why Google does this, and so I think "we" need to find a better solution where the entire pipeline is open and the compute somehow crowdfunded. Because there will be a time when these local models will get more closed, like Android is closing down. One restriction they might enforce in the future could be that they cripple the models for "sensitive" topics like cybersecurity or health. Or the government could even feel the need to force them to do so.

by 2ndorderthought 15 hours ago

Why would you want to support all users' simple queries in your AI data center if they could run them on their own computer?

It also builds goodwill. It also shows research prowess.

For China it's different. They need to show Americans who don't trust them at all because of propaganda that they have no tricks up their sleeve. It also doesn't hurt when Chinese companies drop models for free that people can run at home and that are about as good as Sonnet. Serious mic drop.

by TheJCDenton 14 hours ago

Very good point on using local ai to avoid data centers costs.

Running AI models on local hardware was exploratory at first, and if it's so easy today it's thanks to open source. It's a little bit coincidental that we have this today, and that mainstream hardware has this capability. The fact that a phone can run very small models is exploratory, or some kind of marketing opportunity at best.

Why would hardware companies ship cards with more AI capabilities (like more VRAM) in the foreseeable future? On what grounds will the marketing for on-device AI keep generating interest? For something this important, it's very uncertain. But above all, it should not depend on these brittle justifications.

Showing goodwill in distribution and research prowess today is positive communication, but it can become exactly the opposite if/when an attack using those small models reaches a high-value target.

For China the cultural difference is so huge that it's difficult to say. I would think they first and foremost need to show everyone inside and outside of China that they match American models. Second, I would say that where Americans prefer a few very powerful companies from the get-go, because they can leverage a lot of capital rapidly to industrialize, China will prefer leveraging a lot of smaller companies exploring a lot of things simultaneously (so doing a lot of research), THEN creating legislation to let only the best (or a few) survive. In the end it's the same result (monopoly or oligopoly), but China may have a stronger core (research) and America may have stronger productive capital, which may prove obsolete... In the long run, on either side it's a gamble, again.

by 2ndorderthought 13 hours ago

They have already shown that their models match or excel over American ones in different cases. For cheaper too.

I disagree on the second point. I think most Americans don't prefer less competition; that's a bit antithetical to the free market.

I doubt the Chinese government cares as much about controlling a few companies as you think they do.

China has a few things going for it beyond research. They are mission driven, they actually have needs for this technology, and those needs will push their entire economy forward, as they are the world's largest manufacturers. They are also huge exporters and have buckets of customer support for various languages.

China also has considerably stronger infrastructure for electricity, etc. Even with an Nvidia embargo they are doing more than showing up.

I don't think it's a matter of who "wins". There is no winning. I think China stands to gain far more from LLMs than the US does, and they have proven they don't need the US to do it, even with the US trying to sabotage its every move into the space. The game is already more or less over in my mind.

If anything I see LLMs as having a huge market in China, and now the US can't even sell it to them.

All I care about is, if I have to use this technology, let me run it locally to avoid the surveillance capitalism aspect. That seems to be the real reason the US has propped up its economy in anticipation of this technology. Yet it doesn't benefit the US or me in the long term.

by codebje 11 hours ago

I'd expect unified memory architectures (Apple M-series, AMD Ryzen AI series, etc) to be the future of local inference, not GPU cards.

by 2ndorderthought 11 hours ago

Time will tell. Depends on small model architecture trends and hardware availability. I wouldn't be surprised if something came slightly out of left field. Considering Taiwan is trapped into producing the same chips for the next 2 years, I wouldn't be surprised if a new player emerged.

by karussell 15 hours ago

Indeed cost can be another factor. Maybe also the main reason why Chrome added an offline model.

by 2ndorderthought 14 hours ago

That and it's lucrative for Android/chrome to have a text summarizer model embedded on your phone probably for government contracts and data exfil but we won't go through there.

by majormajor 14 hours ago

> What is the business model of open weight AI? I don't think there is any. At best it can serve as an advertisement for the more advanced models you sell.

I don't think local will necessarily be open-weight. And then it's not that different from personal computing: you're giving up the big lucrative corporate mainframe, thin-client model for "sell copies to a ton of individuals."

So it'd be someone else (an Apple, or the next-year equivalent of 1976 Apple) who'd start eating into that. There are a few on-device things today, but not for much heavy lifting. At first it's a toy, could maybe become more realized in a still-toy-like basis like a fully-local Alexa; in the future it grows until it eats 80-90% of the OpenAI/Anthropic use cases.

Incumbents would always rather you pay a subscription or per-use forever, but if the market looks big enough, someone will try to disrupt it.

by treis 13 hours ago

Compute has gone back and forth from mainframe/thin client to fat client a few times already. LLMs will probably follow at some point but I think it's going to take a long time.

The cost to transmit text is basically free and instantaneous. The rent (i.e. a GPU in a data center) vs buy is going to favor rent until buy is a trivial expense. Like 50-100 range.

Even then a LLM that just works is easier than dealing with your own

by majormajor 9 hours ago

Storage has moved back and forth, but I don't think compute has ever really gone back to thin client. Even Gmail, Google Docs, etc. are running a buttload of JavaScript on the user device. Various attempts at avoiding that (remote .NET or JVM stuff on early "smart-ish" phones) crashed and burned.

Video game streaming is the closest thing, and it's never really taken off. (And this, IMO, is a good comparison because it's a pretty similar magnitude up-front-cost, $500-$4000.)

Once the local-AI-is-good-enough (Sonnet level for a lot of basic tasks, say) for a $1k up-front investment the appeal of having something that can chew on various tasks 24/7 w/o rate limits, API token budget charge concerns, etc, is going to unlock a lot of new approaches to problems. Essentially more fully-baked line-of-business OpenClaw-type things. Or the smart home automation bot of Siri's dreams. You can more easily make that all private and secure when all the compute is local: don't give any outside network access. Push data into the sandbox periodically via boring old scripts-on-cronjobs, vs giving any sort of "agentic" harness external access. Have extremely limited data structures for getting output/instructions back out. I'd never want to pass info about my personal finances into a third party remote model; but I'd let a local one crunch numbers on it.

Even if you need Opus/Mythos/whatever level for certain tasks, if 95% of everything else you'd pay Anthropic or OpenAI for can now be done on things you own w/o third-party risk... what does that do to the investment appeal of building better AI appliances to sell to end users vs. building better centralized models?

I think "what if today's LLM performance, but running entirely under your control and your own hardware" opens up a LOT of interesting functionality. Crowdsource the whole world's creativity to figure out what to do with it, vs waiting for product managers and engineers at 3 individual companies to release features.

by treis, 9 hours ago

There was a time when people ran software on their computer with limited connectivity. In the late 90s/early 2000s most of what you did ran locally on your machine. Your emails would be downloaded and there'd be a shared drive, but otherwise it was all local.

Anyways, who's spending $1k on an LLM machine when they can spend $20 (or 0) on a subscription? And who's having an LLM crunching away 24/7 anyways? Anyone who is going to do something like that probably wants a cutting-edge model.

It'll (probably) get to a point where the hardware is cheap enough and advancement levels off. But we're a ways from that, and even then, when a data center is 20ms away, why not offload heavy compute that's mostly text in, text out?
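For a rough sense of scale on the $1k vs. $20 comparison, here is a back-of-the-envelope sketch in Python. It assumes the $20 is a monthly price (the comment does not say) and ignores electricity, resale value, and any quality gap between models; `breakeven_months` is a made-up helper name.

```python
def breakeven_months(hardware_cost: float, monthly_subscription: float) -> float:
    """Months of subscription fees needed to equal the up-front hardware cost."""
    if monthly_subscription <= 0:
        # A free tier never catches up with a paid box on cost alone.
        return float("inf")
    return hardware_cost / monthly_subscription


print(breakeven_months(1000, 20))  # 50.0, i.e. a bit over four years
print(breakeven_months(1000, 0))   # inf
```

Under those assumptions the box only pays for itself on price after about 50 months, which is why the argument for local tends to rest on privacy, rate limits, and 24/7 use rather than on cost.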

by zozbot234, 12 hours ago

Except that buy is a trivial expense because the hardware has been bought already. You've got a whole lot of iGPU and dGPU silicon that's currently sitting idle as part of consumer devices and could be working on local AI inference under the end user's control.

by worldsayshi, 15 hours ago

It should be feasible to crowdfund training runs, right?

by dmd, 15 hours ago

A training run costs somewhere in the neighborhood of a billion dollars. That’s a thousand millions.

How many crowdfunded projects do you know that have raised even one percent of that? Who’s going to be in charge of collecting that scale of money? Perhaps some sort of company formed for the benefit of humanity, which will promise to be a non-profit? Some sort of “Open” AI?

Oh, wait.

by derektank, 11 hours ago

It’s well within the capabilities of governments in developed countries. If Mistral did not already exist, I would definitely expect the French government to invest in a national LLM, if only because of how defensive they are of the French language.

by iugtmkbdfil834, 15 hours ago

<< That’s a thousand millions.

I can't say that you are lying, and you are not exactly exaggerating either. It is true that a new SOTA model, trained from literal scratch, would be expensive.

But, and it is not a small but, is the starting point really zero?

by thefounder, 7 hours ago

Cloud providers have incentives to release open source models, but for some reason this happens only in China. Amazon, Azure, and Google benefit from open source models because people run them on their hardware.

by sumeno, 14 hours ago

If a local model hits critical mass, the business model is to use it to shape opinions in a way that is advantageous for the company/owners.

Much like the current Twitter model: being able to put your thumb on the scale of "truth". Bake a stronger bias towards their preferred narrative directly into the model. Could be as "benign" as training it to prefer Azure over AWS. Could be much worse.

by dleslie, 14 hours ago

This is where government funding can play a role.

Sometimes the public good is best served by public expenditure.

by CamperBob2, 13 hours ago

"Government funding" these days would mean that Trump pays Elon Musk (or more likely vice versa) to make Grok 4.20 the only legal LLM for use by Americans.

by dleslie, 13 hours ago

Outside of the USA it would not look like a wealth transfer to an oligarch.

Not every country is in a crypto-libertarian race to hoard power and wealth.

by CamperBob2, 11 hours ago

> Not every country is in a crypto-libertarian race to hoard power and wealth.

Meanwhile, in the EU, the model would be collectively financed, trained by a competent, neutral agency... and then completely lobotomized in the name of "the children," "safety," "IP rights," "correct speech," dozens of individual countries' legal and regulatory requirements, and any number of additional vocal, noncontributing NGOs.

So no one would get rich off of the public model, but no one would get much of anything else out of it, either.

As another reply suggests, there's a reason why things happen in the USA first. Even when they don't, the prime movers move here as soon as they can. Or at least they used to.

by fragmede, 14 hours ago

The business model is avoiding the total lack of attention Qwen and Kimi would get if their models weren't downloadable. Before they released the weights, basically zero attention was paid to them in the Western hemisphere, for whatever reason. By releasing the weights, they became relevant in the Western world. The business model is to get people in the West, who otherwise would never have heard of them, to pay to use the platform hosting their AI. As you said, advertising/marketing, essentially.

by codebje, 11 hours ago

Baidu have a lot of services I've never heard of, that are highly successful in China. The lack of interest in expanding into Western audiences doesn't seem to matter there - what's different about inference?

by digitaltrees, 11 hours ago

Exactly this. The assumption that your access will last is very risky. And the assumption that Chinese companies will keep trying to erode the economic viability of American models by open-sourcing reverse-engineered models forever is naive.

by ios-contractor, 12 hours ago

I don't think it should be local vs. cloud AI. I think local AI should be treated as a separate product. Local AI should do the things that really don't need cloud AI, and cloud AI should be used as a fallback. That would reduce a lot of costs.

by slicktux, 14 hours ago

I’m just waiting for the US Government to implement its own local AI, which will eventually lead to them open-sourcing it because it’s taxpayer funded. And given that the NSA has decades’ worth of internet data they can train on, the open weights would be just as good as any company’s…

by fragmede, 6 hours ago

With this administration?

by beloch, 10 hours ago

Keep the Silicon Valley pattern in mind:

1. Innovate, create, and offer it all at sweetheart prices to the public while you rack up debt.

2. Shovel in more money and either buy out or outlast the competition. Become dominant. Lock in your users any which way you can.

3. Enshittify and cash in.

The deals Anthropic, OpenAI, etc. offer won't stay this good much longer. Don't let them lock you in. Failing that, you should budget more for the same service. You're going to need it. Having an open alternative running on your own hardware offers non-negligible peace of mind.

by aabhay, 15 hours ago

Disagree with this. When cost becomes an important factor, or the free but worse option becomes compelling and accessible (i.e. an on-device agent with Apple-style UX), there has been a significant shift in user behavior towards local. Think about stuff like removing backgrounds from photos or OCR on PDFs: who uses paid services for casual usage of these things?

by furyofantares, 13 hours ago

What's the gamble here exactly? What agency do we have in it right now?

by iLoveOncall, 14 hours ago

The mainstream audience does not have the faintest idea that "local AI" is even a thing.

by CamperBob2, 14 hours ago

Just as their counterparts in 1975 had no idea that "personal computers" were even a thing.

Read through a 1970s-era issue of Popular Electronics or Byte, and then spend some time surfing /r/LocalLlama. You'll get a sense of real-time deja vu, like you're watching history unfold again.

by irishcoffee, 14 hours ago

I own two 5070 Ti cards in a rig that I would gladly donate time on for a distributed model-training effort. The kicker is the training data. I would want to gate the data to anything before 2022. I don’t know how to coordinate that, but I would really like to be involved in something like this. SETI, for LLMs.

by AlexCoventry, 13 hours ago

Bandwidth is the killer in distributed LLM training.
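A rough worked example of why, as a Python sketch. The numbers are illustrative assumptions, not measurements: a 7B-parameter model, 2 bytes per gradient value, a 100 Mbit/s home uplink, and naive data parallelism where every step ships a full set of gradients. `sync_seconds` is a made-up helper name.

```python
def sync_seconds(num_params: int, bytes_per_param: int, link_mbps: float) -> float:
    """Seconds to send one full set of gradients over a link of the given speed."""
    bits_to_send = num_params * bytes_per_param * 8
    return bits_to_send / (link_mbps * 1_000_000)


# Assumed: 7B parameters, 16-bit gradients, 100 Mbit/s uplink.
print(sync_seconds(7_000_000_000, 2, 100))  # 1120.0 seconds per step
```

Under those assumptions a single step spends close to 19 minutes just on upload, before any compute. Real distributed-training schemes compress or synchronize less often to cut this down, but the gap versus datacenter interconnects stays large.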

by irishcoffee, 12 hours ago

What’s the rush?

by codebje, 11 hours ago

It depends on the purpose of the model. AFAIK LLMs aren't particularly capable at researching answers, relying more on having 'truth' baked into their weights, so if it takes 12 months to train up a crowd-trained LLM it'll be 12 months behind the times.

How serious a risk is poisoned weights?

Can we leverage the cryptobros into using LLM training as a proof of work?

by MarsIronPI, 9 hours ago

What? I use Qwen 3.5 35B-A3B and it definitely knows how and when to do web searches to fill in gaps in its knowledge.

by codebje, 7 hours ago

Does Qwen3.5 know it needs to do this because the API in question has had loads of churn and much of its training data is on obsolete versions, or do you need to prompt it? How well does it handle having an API reference with sample code in its context window?

Having an LLM use a web search tool isn't the same thing as researching a topic, IMO, because it's so ephemeral and needs constant reinforcement. LLMs aren't learning machines, they're static ones.

by irishcoffee, 1 hour ago

How many facts change over time to create obsolete data? Unless you’re researching current events, I contend it’s a moot point.
