Everyone who's betting their competency on the generosity of billionaires selling tokens at 1/10th to 1/20th of cost, or on a delusional future where capable OS models fit on consumer-grade hardware, is actually cooked.
Of course there will always be larger flagship models, but if you can count on decent on-device inference, it materially changes what you can build.
Why?
The trend is heading in the opposite direction: fewer options for strong consumer hardware, and more push toward cloud-based products. This is a memory issue more than anything. Nvidia is done selling its GDDR7 to gamers and people with AI girlfriends.
It's not out of the realm of possibility, but I just want to make you aware that this would be a very surprising development in computing history.
> in the next few years a "good enough" model will run on entry-level hardware
And that's for laptops with unified memory. In the desktop space, 8GB discrete GPUs are going to be sticking around for a very long time.
But that's not my main argument. My main argument is that it's delusional for OP to think it's reasonable to expect that soon we'll be able to run models on consumer hardware that can build basically anything.
But I do think there will be many compromises made for consumer electronics. I don't think the powers that be are eager to give consumers all the best memory (that should be clear by now). There are three DDR5 DRAM manufacturers in the world, and they have to supply memory to all the world's militaries, governments, and datacenters/corporations. Consumers are the last priority.
You can argue whether the projection is too optimistic or not, but this project definitely made me a bit more optimistic on that front.
One example is https://blog.can.ac/2026/02/12/the-harness-problem/, which focuses just on improving edits.
Or, if we could really steer these open-source models with well-structured plans, could we spend more time planning up front and kick off the build overnight (à la the night shift: https://jamon.dev/night-shift)? Something like the sketch below.
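Purely to illustrate the idea, here's a minimal sketch of an overnight runner, assuming a local OpenAI-compatible endpoint (e.g. a llama.cpp server or Ollama) and a hypothetical `plan.md` with one step per line. None of these names or URLs come from the linked post.

```python
# Hypothetical overnight runner: feed a pre-written plan to a local model,
# one step at a time, and log the output for review in the morning.
# Assumes an OpenAI-compatible chat endpoint at BASE_URL (llama.cpp, Ollama, etc.).
import json
import urllib.request
from pathlib import Path

BASE_URL = "http://localhost:8080/v1/chat/completions"  # assumed local endpoint
MODEL = "local-model"  # whatever name the local server exposes

def ask(prompt: str) -> str:
    payload = json.dumps({
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        BASE_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

def main() -> None:
    # plan.md: one well-structured step per line, written before going to bed.
    steps = [s for s in Path("plan.md").read_text().splitlines() if s.strip()]
    with Path("night-shift.log").open("a") as log:
        for i, step in enumerate(steps, 1):
            answer = ask(
                f"Step {i} of the plan:\n{step}\n\nProduce the change as a unified diff."
            )
            log.write(f"\n=== step {i}: {step}\n{answer}\n")

if __name__ == "__main__":
    main()
```

In practice you'd want to apply and test each diff before moving on to the next step, but the point is that nothing here needs a frontier-sized model, just one that follows the plan reliably.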
They said the same thing about open source chess engines.
48 GB is enough for a capable LLM.
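Rough back-of-envelope math, assuming a ~70B-parameter model quantized to roughly 4 bits per weight; the exact figures depend on the quantization scheme and context length, so treat these as illustrative:

```python
# Back-of-envelope memory estimate for running a quantized model in 48 GB.
# All numbers are illustrative assumptions, not measurements of a specific model.
params = 70e9                 # ~70B parameters
bits_per_weight = 4.5         # ~4-bit quantization plus scales/overhead
weights_gb = params * bits_per_weight / 8 / 1e9

kv_cache_gb = 5               # assumed KV cache for tens of thousands of tokens
overhead_gb = 2               # activations, buffers, runtime

total = weights_gb + kv_cache_gb + overhead_gb
print(f"weights ≈ {weights_gb:.1f} GB, total ≈ {total:.1f} GB")  # ≈ 39.4 GB + 7 GB ≈ 46 GB
```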
Doing that on consumer-grade hardware is entirely possible. The bottleneck is CUDA and other intellectual-property moats.