One set of models runs on 8GB VRAM / 16GB RAM and another runs on 24GB VRAM / 64GB RAM. The first is very useful for easy coding tasks, the second for moderately complex ones.
The latest small open models are incredibly useful when configured properly (quant size, sampling params, careful use of context, etc.).
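To make "configured properly" concrete, here's a minimal sketch of the client side, assuming a llama.cpp/Ollama-style OpenAI-compatible server on localhost; the endpoint, model name, and sampling values are placeholders, not recommendations:

    # Sketch: talking to a locally hosted model through an OpenAI-compatible
    # endpoint with explicit sampling params. The URL, model name, and values
    # below are assumptions -- tune them for your own server and quant.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:8080/v1",  # local server, no cloud round-trip
        api_key="not-needed-for-local",       # most local servers ignore the key
    )

    resp = client.chat.completions.create(
        model="qwen3-8b-q4_k_m",   # hypothetical quantized model name
        temperature=0.7,           # conservative sampling for code tasks
        top_p=0.8,
        max_tokens=1024,           # keep generations short to preserve context
        messages=[
            {"role": "system", "content": "You are a concise coding assistant."},
            {"role": "user", "content": "Write a function that parses an ISO 8601 date."},
        ],
    )
    print(resp.choices[0].message.content)

The point is that sampling, context budget, and quant choice live in your hands instead of behind someone else's defaults.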
That's a very strange comment. Why would anyone run a dense model on a low-end computer? An 8B model only makes sense if you have a dGPU, and a Qwen3.6 or Gemma4 MoE isn't going to have "the hell beaten out" of it on most tasks, especially if you use tools.
Finally, over the lifetime of your computer, a ChatGPT subscription is going to cost more than the reference computer itself! So the real question is whether you're better off with a $1000 computer plus a ChatGPT subscription or with a $2000 computer (assuming a conservative lifetime of 4 years for the machine).
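Back-of-the-envelope numbers, assuming the $20/month Plus tier and the 4-year lifetime above (both assumptions; a higher tier or a longer-lived machine tilts it further toward the local box):

    # Rough total-cost comparison; plug in your own prices.
    months = 4 * 12                    # assumed 4-year lifetime
    subscription = 20 * months         # ChatGPT Plus at $20/month ~= $960

    cloud_setup = 1000 + subscription  # $1000 machine + subscription ~= $1960
    local_setup = 2000                 # $2000 machine, local models only

    print(cloud_setup, local_setup)    # 1960 vs 2000: roughly a wash at $20/mo;
                                       # any higher tier or a machine that lasts
                                       # past 4 years flips the gap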
My Strix Halo desktop (which I paid ~€1700 for, before OpenAI derailed the RAM market) paired with Qwen3.5 comes close to replacing a $200/month subscription; at that rate the hardware pays for itself in under a year, so the cost/benefit ratio is strongly in favor of the local model in my use case.
The complexity of tracking model releases and setting up self-hosting is a valid argument against local models, but that's not at all the same as saying local models are too bad to use (which is complete BS).
The biggest risk I see is Nvidia hitting delays, bad luck in R&D, or enough meh generations to depress their growth projections, and then everything gets revalued.