This place and the AI tech subreddits (the ones that aren't specifically about local or FOSS models) seem to have this dynamic, to the degree that I've suspected astroturfing.
So it's refreshing to see that maybe it's just a coincidence or confirmation bias on my end.
Thanks!
It makes using my Claude Pro sub actually feasible: write a plan with it, pick the plan up with my local model and implement it - now I'm not running out of tokens haha.
Is it worth it from a unit-economics POV? Probably not, but I bought this thing to learn how to deploy and serve models with vLLM and SGLang, and to learn how to fine-tune and train models with the 128GB of memory it gets to work with. Adding two 40GB vectors in CUDA was quite fun :)
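If anyone's curious, that exercise looks roughly like this - a minimal sketch, not my exact code, assuming the 128GB is unified memory shared by CPU and GPU (so cudaMallocManaged can back buffers far bigger than any discrete card's VRAM) and adding in-place so only two 40GB buffers are live:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Grid-stride loop: a fixed-size launch walks all n elements.
__global__ void vec_add(float* a, const float* b, size_t n) {
    size_t stride = (size_t)gridDim.x * blockDim.x;
    for (size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride)
        a[i] += b[i];  // in-place add, so only the two input buffers exist
}

int main() {
    const size_t n = 10ULL * 1000 * 1000 * 1000;  // 10B floats = 40GB per vector
    float *a, *b;
    // Managed (unified) memory: with CPU and GPU sharing the 128GB,
    // two 40GB buffers fit comfortably.
    cudaMallocManaged(&a, n * sizeof(float));
    cudaMallocManaged(&b, n * sizeof(float));

    for (size_t i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }  // slow CPU init, fine for a toy

    vec_add<<<8192, 256>>>(a, b, n);
    cudaDeviceSynchronize();

    printf("a[0]=%.1f a[n-1]=%.1f\n", a[0], a[n - 1]);  // expect 3.0 and 3.0
    cudaFree(a);
    cudaFree(b);
    return 0;
}
```

The grid-stride loop plus managed memory is basically the whole trick: the runtime pages data to the GPU on demand instead of you staging explicit copies.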
I also use Z.ai's Lite plan at the moment for GLM-5.1, which is very capable in my experience.
I was using Alibaba's Lite Coding Plan... but they killed it entirely after two months haha. Too cheap, obviously. Or all the *claw users killed it.
So I agree with you: it's better than Sonnet but way cheaper. I do wonder how long that will last, though.
Most recently I used it to develop a script to help me manage email. The implementation included interacting with my provider over JMAP, taking various actions, and implementing an automated unsubscribe flow. It was greenfield, and quite trivial compared to the codebases I normally interact with, but it was definitely useful.
The TL;DR is that unless you're doing it as a hobby, or you work in an environment where none of the data-privacy options Anthropic/OpenAI support (including running on Azure/Bedrock with ZDR) work for you, it's not worth it.
The best open models are around the Sonnet 4.6 level. That's excellent, but the level of tasks you can give to GPT 5.4 or Opus 4.6 is just so much higher it doesn't compare (and Opus 4.7 seems noticeably better in my few hours of testing too).
I have my own benchmarks, but I like this much under-publicized OpenHands page: https://index.openhands.dev/home
It shows that closed models do best on every task they test. The closest an open model gets is Minmax 2.7 on issue resolution, where it's ~1% worse than the leaders.
That matches my experience - fine for small problems, but well behind as the task gets bigger.
When I argue this, my point is that FOSS shouldn't target the desktop with open weights - it should target H200s. Models with really big parameter counts and big VRAM requirements.
Those can always be distilled down, but you can't really go the other way.
Subsidizing is the opposite of competing. It's literally the practice of underpricing your product to box out competition. If everyone were competing on a level playing field, they would all price their products above cost.
All these tech oligarch asshat companies need to be regulated to hell and back.
For many things you already need to go local, and in the future, if you want any privacy, you'll need to go local.
Big players operating at a loss to distort the market is not a good thing overall.
It's not the smaller players spending billions on training data.