
When Kubernetes was released, very few people could run it, and even fewer could run it usefully.

Right now there are a few people who can run a 1T model at home, even fewer who can run a 5T model, and probably single digits who can run a 10T model.

But if an open source 10T model were available, you can be sure people would find new ways to quantize it, new ways to configure hardware, and new ways to think about problems that would make it useful.

1T+ models (DeepSeek v4, Kimi K2.6, etc.) are available as open weights now, and for ~$5000-$10000 you can run them usefully at home. Two years ago no one was contemplating that.

$250K to run a 10T model might be possible now. There are many companies that will pay that, and that will push the tools and techniques downwards for the rest of us.
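For a rough sense of scale behind these parameter counts, here is a minimal sketch of the memory needed just to hold the weights. It assumes "1T" means 10^12 parameters, counts weights only (no KV cache, activations, or runtime overhead), and uses decimal gigabytes. The function and table names are made up for illustration.

```python
def weight_footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Decimal gigabytes needed to hold the weights alone (assumption: no overhead)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: "1T" in the thread means 1e12 parameters, and so on.
SIZES = {"1T": 1e12, "5T": 5e12, "10T": 10e12}
BIT_WIDTHS = (16, 8, 4)


def footprint_table() -> dict:
    """Map each model size to {bit width: weight footprint in GB}."""
    return {
        name: {bits: weight_footprint_gb(n, bits) for bits in BIT_WIDTHS}
        for name, n in SIZES.items()
    }


if __name__ == "__main__":
    for name, row in footprint_table().items():
        cells = ", ".join(f"{bits}-bit: {gb:,.0f} GB" for bits, gb in row.items())
        print(f"{name}: {cells}")
```

Under these assumptions, halving the bits per weight halves the footprint, which is why quantization matters so much for what fits on home hardware.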

by verdverm 15 hours ago

case in point: https://spark-arena.com/leaderboard

by cortesoft 14 hours ago

> any AI system that runs on a computer that i do not control is by my definition not Open Source.

This is not true at all. It would be open source if you could download it and run it anywhere that is capable, and are free to move it and modify it as much as you want.

Just because you don't have a computer at home powerful enough doesn't mean it isn't open source.

by rustcleaner 13 hours ago

I think he means theoretically in possibility space, without relying on a based insider leaking a 'closed' frontier model to bittorrent or hyphanet.

by sheeshkebab 16 hours ago

Qwen models are actually very competitive with frontier models, and you can run them on your local computer. Gotta have a decent graphics card and by that time the current cost of the rig may not justify it over paying $100/month for cloud model but it’s all out there.

by nirui 14 hours ago

Qwen is still controlled by Alibaba, one company. We can't let the future be in the hands of a few companies, can we?

Fun fact: Qwen was not initially an Apache-licensed project. It was released under a custom license from Alibaba that restricted commercial use: https://github.com/QwenLM/Qwen/blob/ba2d85a13b28ed1ee0dde2d6.... There's no guarantee that they won't just switch it back later.

Kudos to them for switching to the Apache License, of course. BUT they're still a for-profit company. So is DeepSeek, btw.

by rustcleaner 13 hours ago

>Gotta have a decent graphics card and by that time the current cost of the rig may not justify it over paying $100/month for cloud model but it’s all out there.

Never, ever, subscribe. When you subscribe, they win. They cornered the silicon market to force you to subscribe. Don't be a sub, or at least keep your sub tendencies in the bedroom. ;^)

by alecco 7 hours ago

Please don't over-promise. -- An AI open³ dev.

by NamlchakKhandro 15 hours ago

Fluctuating token costs make it worth it

by itkovian_ 16 hours ago

Projects like pluralis agora solve this problem. Really, what you want is for the model to be collectively owned and governed, not local.

15 hours ago

deleted

by singpolyma3 16 hours ago

LLMs that you can run locally, on hardware that is within reach to acquire, have been a thing for some time.

by bitwize 15 hours ago

Recently I fired up Gemma4-26B-A4B on my 8-year-old PC... and it ran surprisingly well!

But I am going to need a much beefier machine to get it to the point where it can do anything but very trivial dev tasks acceptably fast, and I'm going to need a much beefier model, perhaps one not so aggressively quantized, to keep it on task without the wheels completely falling off. Already we're talking serious money outlay, perhaps still within my programmer salary to accommodate, but just barely. And we're not even anywhere near the performance characteristics a frontier model can support.
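To illustrate why a more aggressive quantization can cost quality, here is a toy sketch of symmetric round-to-nearest quantization with a single scale for the whole list. This is an assumption-laden simplification: real quantizers use per-block scales and smarter rounding, and nothing here reflects how any particular Gemma build was quantized. The sample weights are made up.

```python
def quantize_dequantize(values, bits):
    """Round each value to a signed integer grid of the given width, then map back."""
    qmax = 2 ** (bits - 1) - 1              # 127 for 8-bit, 7 for 4-bit
    peak = max(abs(v) for v in values)
    if peak == 0:
        return list(values)                 # nothing to scale
    scale = peak / qmax                     # one shared scale (toy assumption)
    restored = []
    for v in values:
        q = max(-qmax, min(qmax, round(v / scale)))
        restored.append(q * scale)
    return restored


def mean_abs_error(values, bits):
    """Average absolute difference between original and round-tripped values."""
    restored = quantize_dequantize(values, bits)
    return sum(abs(a - b) for a, b in zip(values, restored)) / len(values)


# Hypothetical weights, chosen only for illustration.
WEIGHTS = [0.013, -0.42, 0.27, -0.081, 0.66, -0.95, 0.004, 0.31]

if __name__ == "__main__":
    for bits in (8, 4):
        print(f"{bits}-bit mean abs error: {mean_abs_error(WEIGHTS, bits):.5f}")
```

With fewer bits the grid is coarser, so small weights collapse to zero or to a neighbor and the round-trip error grows. That error is one plausible source of the "wheels falling off" behavior described above.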

by verdverm 15 hours ago

A DGX Spark runs models of this size (I personally like qwen36moe better than gemma4moe) at speeds fast enough for interactive coding sessions. Algorithmic advances like DiffusionGemma give ~4x token generation speeds (https://deepmind.google/models/gemma/diffusiongemma/).

by matheusmoreira 17 hours ago

We can run open weight models on our own machines.

by em-bee 16 hours ago

yes, but a model that runs on my own machine will never have the capacity of a model that runs in a datacenter. as i said, it can't compete with that.

by thewebguyd 16 hours ago

If RAM prices ever come down, you can have a machine that can run a capable local model.

Qwen 2.5 72B is surprisingly capable, almost on par with GPT-4o if not a little better. You can run it on a 128GB Mac Studio with 8-bit quantization. You need about 77GB for the weights and ~15GB for your context window & cache.
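The ~77GB and ~15GB figures can be sanity-checked with some back-of-the-envelope arithmetic. The sketch below rests on assumptions that are not stated in the comment: a parameter count of about 72.7B, a llama.cpp-style Q8_0 layout (32 int8 weights plus one fp16 scale per block, i.e. 8.5 bits per weight), and an attention layout of 80 layers, 8 KV heads, and head dimension 128 with an fp16 cache. Treat all of these as illustrative inputs, not measured values.

```python
# Assumed figures, not taken from the thread.
N_PARAMS = 72.7e9          # approximate parameter count
Q8_BITS_PER_WEIGHT = 8.5   # 32 int8 weights + one fp16 scale per block
N_LAYERS = 80              # assumed transformer depth
N_KV_HEADS = 8             # assumed grouped-query KV heads
HEAD_DIM = 128             # assumed per-head dimension
KV_BYTES_PER_VALUE = 2     # fp16 cache entries


def weights_gb(n_params: float, bits_per_weight: float) -> float:
    """Decimal GB for the quantized weights."""
    return n_params * bits_per_weight / 8 / 1e9


def kv_bytes_per_token(n_layers: int, n_kv_heads: int, head_dim: int,
                       bytes_per_value: int) -> int:
    """Bytes of KV cache per token: one key and one value vector per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value


def tokens_in_budget(budget_gb: float, per_token_bytes: int) -> int:
    """How many tokens of context fit in a given cache budget."""
    return int(budget_gb * 1e9 // per_token_bytes)


if __name__ == "__main__":
    per_token = kv_bytes_per_token(N_LAYERS, N_KV_HEADS, HEAD_DIM, KV_BYTES_PER_VALUE)
    print(f"weights: {weights_gb(N_PARAMS, Q8_BITS_PER_WEIGHT):.1f} GB")
    print(f"KV cache per token: {per_token} bytes")
    print(f"tokens in 15 GB: {tokens_in_budget(15, per_token)}")
```

Under these assumptions the weights come out a little over 77GB, and a 15GB cache budget covers a context in the tens of thousands of tokens, which leaves headroom on a 128GB machine.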

Pricing remains to be seen, but there are also those new Nvidia laptops coming out. The Surface Laptop Ultra should have 128GB RAM with a Blackwell GPU, and they're saying 1 petaflop of AI compute, if you can tolerate Windows (no idea if it'll boot Linux until the hardware is out).

These models are roughly a year or less behind the frontier models. We really just need hardware to catch up and alleviate the price pressure on RAM.

by rustcleaner 13 hours ago

>If RAM prices ever come down

Maybe an unpopular opinion here (seeing how Y Combinator is his baby), but I think OpenAI and Sam Altman should be financially decimated for cornering the DRAM market. What he's done is a step or two removed from what the Hunt brothers did. His buy-up of future DRAM silicon has measurably harmed personal computing, and he should not get to walk away with a 'win' from it.

by randbyte 14 hours ago

> a model that runs on my own machine will never have the capacity of a model that runs in a datacenter.

I don’t think so. A locally run model only needs to serve one or a few people. It seems possible to run a DeepSeek v4 model at full capacity on a server costing 200k USD. Very expensive, but not impossible.

Factor in hardware and software improvements over time, plus the fact that most people may only need a smaller, quantized model, and it should take a PC on the order of 10k USD.

by mejutoco 8 hours ago

It also will not change arbitrarily. Different strokes for different folks.

by melozo 15 hours ago

Huh? Open source is a quality of the software, not specific to the hardware used to run the model. The demand is that model weights are openly available for anyone to run and fine tune without restriction. Has nothing to do with the hardware it runs on.

by ls612 16 hours ago

Call it open weights if you must. But even with OSS, just having the source code doesn't mean your machine is powerful enough to run it usefully. This has always been true.