upvote
Try 27b, it's significantly smarter than 35b-a3b (although it is slower, it's not so bad with MTP).
reply
At least according to gertlabs, Qwen3.6 27B outperforms every SoTA (closed) model at Kotlin: https://archive.vn/RYBCL / https://gertlabs.com/rankings?mode=agentic_coding&language=k...
reply
Interesting. I wonder if there is opportunity to train a set of small model variants to excel at a certain stacks. Eg Qwen3.6-27B for Node + React or Qwen3.6-27B for Rust + TUI
reply
This is always how I've imagined small/consumer-hardware models going in time. If I only ever code in Python, give me a model that does just that (plus some general CS, algorithms, structure, etc.) and does it super-fast and well. Make it small enough that if I need a Python back end and an HTML front end, another specific model can load alongside and collaborate on the front end.

Or give me a pure shopping model that has a general understanding of products and product categories, and then will playwright/scrape/API into shopping sites to compare options and find me what I want. Etc.

reply
Qwen 3.6 27B is an anomalously strong all-around model for its size, but when we run our evaluations, we generate 10 coding submissions/language/model (110 total). So full discosure, the per-language per-model performances can be noisy (I do not think Qwen3.6 27B is better than Fable 5 in agentic workflows when writing Kotlin, given enough samples, although we do find some interesting anomalies that hold up under large sample sizes).
reply
Hmm, I just assumed bigger was better. How's it different?
reply
Off the top of my head since it seems to be the quick info you're looking for: IIRC, with these two, the 27B is a dense model, meaning it's all active at inference. Meanwhile, the 35B is a Mixture of Experts (MoE), so only part of its network (3B?) is active at any time.
reply
Thanks! Dense models have been slow on my compute, but I'll give it a try. If its not toooooo slow then it's fine I mostly fire and forget agents anyway.

Edit: seems fast! I'll try it out some more, thanks again.

reply
I'm running qwen36.:35b:iq4 IQ4_XS quant. Takes 18 GB of RAM with 131k context window. Seems to be really good. Have it running local stuff via Hermes, using a cloud model via Ollama (Deepseek V4-Pro) for heavy lifting.
reply
If your framework desktop is the 128G Strix Halo, I recommend giving Qwen 3.5 122B-A10B a shot.

This Q5_K_M quant should be near lossless and fit with full 256K context in about 100GB of RAM: https://huggingface.co/AesSedai/Qwen3.5-122B-A10B-GGUF

reply
3.6 scores better on coding across the board.

Edit: specifically Qwen 3.6 27B beats that on coding and agentic workflows.

reply
I'll keep this in mind.
reply
Could you please share which coding agent you are using with it?
reply
I settled on opencode after trying goose and aider as well. I'll probably try some more but opencode worked similar to Claude code which is my main agent.

I serve the model with ollama and am thinking about replacing ollama but haven't looked into it.

I have openwebui for chat if I want that too, but don't really use it.

reply
npx @oh-my-pi/pi-coding-agent
reply
I am using Mistral Vibe.
reply