by bakies 7 hours ago

by smcleod 6 hours ago

Try 27b, it's significantly smarter than 35b-a3b (although it is slower, it's not so bad with MTP).

by ignoramous 6 hours ago

At least according to gertlabs, Qwen3.6 27B outperforms every SoTA (closed) model at Kotlin: https://archive.vn/RYBCL / https://gertlabs.com/rankings?mode=agentic_coding&language=k...

by iosjunkie 5 hours ago

Interesting. I wonder if there is an opportunity to train a set of small model variants to excel at certain stacks, e.g. Qwen3.6-27B for Node + React or Qwen3.6-27B for Rust + TUI.

by mft_ 2 minutes ago

This is always how I've imagined small/consumer-hardware models going in time. If I only ever code in Python, give me a model that does just that (plus some general CS, algorithms, structure, etc.) and does it super-fast and well. Make it small enough that if I need a Python back end and an HTML front end, another specific model can load alongside and collaborate on the front end.

Or give me a pure shopping model that has a general understanding of products and product categories, and then will playwright/scrape/API into shopping sites to compare options and find me what I want. Etc.

by gertlabs 2 hours ago

Qwen 3.6 27B is an anomalously strong all-around model for its size, but when we run our evaluations, we generate 10 coding submissions/language/model (110 total). So, full disclosure: the per-language, per-model results can be noisy. I do not think Qwen3.6 27B is better than Fable 5 in agentic workflows when writing Kotlin, given enough samples, although we do find some interesting anomalies that hold up under large sample sizes.
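For a rough sense of how noisy 10 samples can be, here is a minimal sketch. It assumes each submission is an independent pass/fail trial, which is a simplification; the actual scoring method is not described in this thread, and the function name is made up for illustration.

```python
import math

def pass_rate_standard_error(pass_rate: float, n_samples: int) -> float:
    """Standard error of an observed pass rate under a simple binomial model."""
    return math.sqrt(pass_rate * (1.0 - pass_rate) / n_samples)

# Assumption: 10 independent pass/fail submissions per language per model.
se_10 = pass_rate_standard_error(0.5, 10)
# Hypothetical larger run, for comparison only.
se_1000 = pass_rate_standard_error(0.5, 1000)

print(f"n=10:   +/- {se_10:.3f}")
print(f"n=1000: +/- {se_1000:.3f}")
```

Under these assumptions, the uncertainty on a single per-language score at n=10 is large enough that two models with similar true ability can easily swap places from run to run.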

by bakies 4 hours ago

Hmm, I just assumed bigger was better. How's it different?

by Lalabadie 4 hours ago

Off the top of my head since it seems to be the quick info you're looking for: IIRC, with these two, the 27B is a dense model, meaning it's all active at inference. Meanwhile, the 35B is a Mixture of Experts (MoE), so only part of its network (3B?) is active at any time.
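A back-of-the-envelope sketch of that difference, assuming the common rule of thumb of about 2 FLOPs per active parameter per generated token, and taking the parameter counts from the model names (27B dense, 35B total with roughly 3B active). The function name is illustrative, not from any library.

```python
def approx_flops_per_token(active_params: float) -> float:
    """Rule-of-thumb forward-pass cost: about 2 FLOPs per active parameter."""
    return 2.0 * active_params

# Assumed from the model names: 27B dense, 35B total with ~3B active (A3B).
dense_active = 27e9
moe_total, moe_active = 35e9, 3e9

ratio = approx_flops_per_token(dense_active) / approx_flops_per_token(moe_active)
print(f"MoE touches about {moe_active / moe_total:.0%} of its weights per token")
print(f"Dense 27B does roughly {ratio:.0f}x the compute per token")
```

Note that the MoE still has to keep all 35B weights loaded; the saving is in per-token work, not in memory. Real-world speed also depends on memory bandwidth and the runtime, so treat the ratio as a rough guide only.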

by bakies 4 hours ago

Thanks! Dense models have been slow on my compute, but I'll give it a try. If it's not toooooo slow then it's fine; I mostly fire and forget agents anyway.

Edit: seems fast! I'll try it out some more, thanks again.

by diseasedyak 5 hours ago

I'm running qwen36.:35b:iq4 IQ4_XS quant. Takes 18 GB of RAM with 131k context window. Seems to be really good. Have it running local stuff via Hermes, using a cloud model via Ollama (Deepseek V4-Pro) for heavy lifting.

by tarruda 4 hours ago

If your Framework Desktop is the 128GB Strix Halo, I recommend giving Qwen 3.5 122B-A10B a shot.

This Q5_K_M quant should be near lossless and fit with full 256K context in about 100GB of RAM: https://huggingface.co/AesSedai/Qwen3.5-122B-A10B-GGUF
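As a sanity check on that memory figure, a rough sketch of the weights-only footprint. The 5.5 bits per weight used for Q5_K_M is an assumed ballpark (the real average varies by model and tensor mix), and the helper name is hypothetical.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of quantized weights in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

# Assumptions: 122e9 parameters (from the model name) and ~5.5 bits/weight
# as a ballpark for Q5_K_M; check the actual GGUF file size for a real number.
weights_gb = weight_memory_gb(122e9, 5.5)
budget_gb = 100  # the "about 100GB" figure quoted above

print(f"weights: ~{weights_gb:.0f} GB")
print(f"left for KV cache and runtime: ~{budget_gb - weights_gb:.0f} GB")
```

Under these assumptions the weights alone come to a bit over 80 GB, so the remainder of the 100GB figure would be context (KV cache) and runtime overhead.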

by Catloafdev 4 hours ago

3.6 scores better on coding across the board.

Edit: specifically Qwen 3.6 27B beats that on coding and agentic workflows.

by bakies 3 hours ago

I'll keep this in mind.

by andy99 6 hours ago

Could you please share which coding agent you are using with it?

by waezel 5 hours ago

Crush: https://github.com/charmbracelet/crush/

The Q8_K_XL MTP model from Unsloth: https://huggingface.co/unsloth/Qwen3.6-35B-A3B-MTP-GGUF

by bakies 4 hours ago

I settled on opencode after trying goose and aider as well. I'll probably try some more, but opencode works similarly to Claude Code, which is my main agent.

I serve the model with ollama and am thinking about replacing ollama but haven't looked into it.

I have openwebui for chat if I want that too, but don't really use it.

by oneshtein 5 hours ago

npx @oh-my-pi/pi-coding-agent

by npodbielski 6 hours ago

I am using Mistral Vibe.

by NamlchakKhandro 6 hours ago