Open weights will always trail SOTA. Forever. So let's say they continue to get better every year. In 100 years, the open-weight model will be 100x better than today's. But the SOTA model will be 101x better. And people will still make this argument that you should pay a premium for SOTA, despite the open weights being 100x better than what we have today.
The open weights today are better than the SOTA models from a year ago. Yet people were using those SOTA models for coding a year ago. If SOTA was good enough then, why isn't the same capability (or better) good enough now?
The answer is: it is good enough. But people have an irrational fear of missing out. They're not really using their brains; they're letting fear lead their decisions. They're afraid "something bad" will happen if they don't use the absolute latest model. Despite repeatable, objective benchmarks telling us that open weights are perfectly capable of doing real work today, the fear is that we're missing out on something better. So people throw away their money and struggle with rate limits because of that fear.
I'm not sure how much I trust those benchmarks; I suspect everyone is gaming them to some degree. Still, if you're willing to accept the latency, open-weight models are definitely usable.
Of course, everyone has realized this, so the hardware you need to run them locally is on the expensive side right now.
CPU manufacturers are working on improvements so that you can more practically run models on a regular CPU with system RAM (it's already possible with llama.cpp, just even slower).
A GPU setup will also cost you around $500-$1,000 in electricity, and even then you won't be able to run a model as good as Anthropic's.
It's also hard to justify the purchase when who knows how quickly it will be outdated; maybe soon you'll need a Blackwell chip (think a $100k machine, e.g. the NVIDIA DGX Station) to run a decent model.
... It'll take a lot more than a year for a machine capable of running openclaw with any sort of reasonable performance to pay for itself.
Or can anyone report good luck with a Strix Halo or a local GPU for under $40k in up-front costs?