They are good for small tasks, but you would not be able to use them the way you use Claude, and you would most likely be disappointed. Then again, I do not know how you use Claude.

There are many online services that offer hosted versions of these models. My advice for anyone thinking about buying hardware to self-host is to try those first; that way you can get an impression of the models' capabilities and limitations before committing to hardware.

reply
Best way to find out is to buy $10 of OpenRouter credits and try the models for yourself.

From my experience doing this, they're nowhere close, but it's entertaining to check in once in a while.
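OpenRouter exposes an OpenAI-compatible chat-completions endpoint, so the $10 experiment above is just a matter of picking a model slug and sending a request. A minimal sketch, assuming the `meta-llama/llama-3.1-70b-instruct` slug and the prompt are illustrative (the request is only built and printed here, not actually sent):

```python
import json

# Illustrative model slug; browse openrouter.ai/models for current options.
payload = {
    "model": "meta-llama/llama-3.1-70b-instruct",
    "messages": [
        {
            "role": "user",
            "content": "Write a bash one-liner that counts unique IPs in access.log",
        }
    ],
}

# Sketch only: to actually call the API, POST this JSON to
#   https://openrouter.ai/api/v1/chat/completions
# with the header  Authorization: Bearer $OPENROUTER_API_KEY
print(json.dumps(payload, indent=2))
```

Swapping the slug is all it takes to compare models side by side on the same prompt, which is the quickest way to form your own impression.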

reply
I've been playing with the open models since the original LLaMA leak. They're getting better over time, they're useful for tasks of moderate complexity, and it's just cool to have a binary blob of knowledge that you can run locally without an internet connection.

However, you should manage your expectations. Whatever the benchmarks say, you'll quickly realise they're not at all competing with Sonnet, let alone Opus. Even the largest open-weights models aren't really doing that.

reply
So far, I’ve found gpt-oss-20B to be pretty good at agentic tasks, but it’s nothing like Claude Code with its paid models.

(I haven’t tried the 120B, which I’ve read is significantly better than the 20B.)

reply