Which Opus?

GLM-5.2 is already close to Opus-4.7 level:

https://aibenchy.com/compare/anthropic-claude-opus-4-7-mediu...

reply
Oh, or you meant a smaller model than GLM-5.2 with similar capabilities?
reply
Probably not. Qwen3.(5|6)-27B seems like an "accidental freak". I'm not even sure they know what they did to create it. A decent number of team members left after that, so unfortunately we might not see another small model that packs such a punch for a while. Hopefully the team is studying the entire training recipe and is able to replicate it. If they are, then a 50-70B dense model might give us such capabilities...
reply
Gemma 4 is competitive with Qwen 3.6. I had a vague feeling that Qwen was better at coding tasks, based on anecdotes and public benchmarks, but I've been doing some benchmarking lately, and Gemma 4 31b is consistently beating Qwen 3.6 at the really hard stuff (finding hard security bugs and vision tasks such as fixing UI layout or categorizing assets, in particular). And for vision, nothing self-hostable beats Gemma 4 12b, including 31b.

I'm still hoping for a bigger Gemma 4 version, but I think they may be worried about competing with their own hosted models, since Gemma 4 is already better than a lot of Google's proprietary models that are still available in AI Studio.

But it is a shame that Qwen probably won't be doing more open models going forward. It is really strong for its size.

reply
Yep! I'm running things locally on an RTX5080 + RTX1060 + 64GB DDR5 RAM, and would love to get a more capable model if possible!

Qwen3.6 27B is pretty good, but I can still notice some spots where it's not as good as the frontier models.

reply
Why wait a few months? There are plenty of better models that you can run locally today. Qwen3.5-397B beats Qwen3.6-27B. MiniMax2.7 is a long-horizon monster. (I haven't given 3 much of a try yet.) KimiK2.6/2.7, MiMoV2.5/MiMoV2.5-Pro and GLM5.1 will wreck Qwen3.6-27B any day on any task.
reply