
by Pragmata 8 hours ago


by XCSme 7 hours ago


Which Opus?

GLM-5.2 is already close to Opus-4.7 level:

https://aibenchy.com/compare/anthropic-claude-opus-4-7-mediu...

by XCSme 7 hours ago


Oh, or you meant a smaller model than GLM-5.2 with similar capabilities?

by segmondy 6 hours ago


Probably not. Qwen3.(5|6)-27B seems like an "accidental freak". I'm not even sure they know what they did to create it. A decent number of the team members left after that, so unfortunately, we might not see another small model that packs such a punch for a while. Hopefully the team is studying the entire training recipe for it and can replicate it. If they can, then a 50-70B dense model might give us such capabilities...

by SwellJoe 26 minutes ago


Gemma 4 is competitive with Qwen 3.6. I had vague feelings that Qwen was better at coding tasks, based on anecdotes and public benchmarks, but I've been doing some benchmarking lately, and Gemma 4 31b is consistently beating Qwen 3.6 at the really hard stuff (in particular, finding hard security bugs and vision tasks for fixing UI layout or categorizing assets). And for vision, nothing self-hostable beats Gemma 4 12b, including 31b.

I'm still hoping for a bigger Gemma 4 version, but I think they may be worried about competing with their own hosted models, since Gemma 4 is already better than a lot of Google's proprietary models that are still available in AI Studio.

But it is a shame that Qwen probably won't be doing more open models going forward. It is really strong for its size.

by Pragmata 7 hours ago


Yep! I'm running things locally on an RTX5080 + RTX1060 + 64GB DDR5 RAM, and would love to get a more capable model if possible!

Qwen3.6 27b is pretty good, but I can still notice some spots where it's not as good as the frontier models.

by segmondy 6 hours ago


Why wait for the next few months? There are plenty of better models that you can run locally today. Qwen3.5-397B beats Qwen3.6-27B. MiniMax2.7 is a long-horizon monster. (I haven't given 3 much of a try yet.) KimiK2.6/2.7, MiMoV2.5/MiMoV2.5-Pro and GLM5.1 will wreck Qwen3.6-27B any day on any task.

by Pragmata 6 hours ago


[dead]