For coding assistance, I have tried OpenCode with several large open models through OpenRouter. All were fairly bad compared to Claude Opus. Could you provide some hints on how I should be holding these open models so that I might get more value out of them?

I agree with the common trope that open models lag behind by about a year, but something magical happened about a year ago when the state-of-the-art models became extremely useful. By this reasoning we're about to see open models perform well, but I'm afraid there is more to it than just waiting for another revolution around the sun.

Note, my application is coding assistance. Open models can be great for other purposes.

reply
I tried almost all the open-source models on OpenCode; none of them is on the level of Opus 4.7.

In my latest experiment I used Opus for the implementation plan, then Cursor Composer 2.5 for execution.

I must say that combo is really good. The main drawback of Claude Code is that it is super slow, so when paired with Composer, which is super fast, it flies.

reply
> I don't see the business model working.

Same. It's a nightmare from a Porter's Five Forces perspective.

There will be a ton of businesses competing in this space, and there will be something of a moat due to how capital intensive the business can be, but there will still basically be infinite competitors.

Great for consumers.

reply
For coding you always want to go with the best model in the category, not something that would have been the best model a year ago, which is what GLM 5.1 is. I say that as a big fan of GLM, because I run a translation site where GLM is good enough for the price.

Most of the money right now is in coding. OpenAI and Anthropic just have to stay 6 months ahead of SOTA open-source models and they'll capture most of the enterprise and dev market.

reply
Yes, I'm an engineer (20 years, mostly in the games/graphics industry) and only use it for code. I've been using GLM 5.1 a lot this week. I went in expecting another "decent" but not really "up to standard" open-source model.

I highly doubt I'll ever use Claude again.

I think you are wrong about Claude being significantly better.

reply
I've been mostly coding with GLM-5.1 as well and I agree with you. DeepSeek V4 Flash is another very good surprise. Incredibly cheap, fast and effective.
reply
> For coding you always want to go with the best model in the category

Will this always be true? There will never be an event horizon/point of diminishing returns where something not-bleeding-edge is "good enough" for 51%+ of users?

reply
For coding, as for everything else in life, cost is a factor.
reply
Cost for the value delivered. If you offered the current SOTA open-source models at $0.1/M, I still think I'd be using Opus or 5.5 at $30/M. Or take GPT 5, which was released in Aug '25: I don't think I'd use it for coding even at $0.1. I'd definitely find other uses for it (translations, agentic workflows, prompt guards, etc.), but for coding I don't think I'd ever completely switch to a SOTA open model.

Unless, of course, there was an actual speed difference: the only reason I'd go with a model a couple of percent worse than the current best is if it were at least 5x faster. Looking forward to Kimi K2.6 being offered publicly by Cerebras.

reply
> I still think I'd be using

That's fine. Other people may not want to pay 300x more and would rather make do with last year's SOTA.

> For coding you always want to go with the best model

Maybe you meant "For coding I always want to go with the best model"?

reply
> For XXX you always want to go with XXX, not XXX

Oh, hey, I recognize you. Thank you for the very forward and thorough orbital sander recommendation at Home Depot. That's exactly what I wanted to deal with on my holiday weekend. You just know so much about this and the rest of us are simple passersby.

reply
This is a silly take. There is a line of "good enough" for most coding (most CRUD apps and APIs are nothing special), and once we are past that, nobody will care about having the "newest, best" model except extreme outliers. And this base "good enough" model will become an ultra-cheap commodity, as we already see with GLM, DeepSeek, etc.
reply
Most work is not coding.

And also, people have it wrong… their models are not the main problem anymore. It’s the RAG.

reply
Depending on RAG is a workflow problem, not an AI problem.
reply