I am trying to figure this out too. What I am seeing is that local models that fit on hardware like yours, such as the Qwen 3.5 family, handle ambiguity poorly, but they are still capable of emitting complete apps.

They also have tool-use issues: https://www.reddit.com/r/LocalLLM/comments/1smzw6s/qwen35_a3...

I would check out the model mentioned in that thread: the unsloth/qwen3.5-35b-a3b GGUF at Q4_K_M.

by frabcus 14 hours ago

Qwen 3.6 is out now and a touch better than 3.5.

I'm finding Google's Gemma 4 even better, though. It seems to hold up the agentic loop better than Qwen.

All will load into 20 GB of VRAM. None are amazing, but they do just about work.
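
For anyone wanting to sanity-check whether a given quant fits, here is a rough back-of-envelope sketch. It only estimates the size of the weights from parameter count and average bits per weight. The function names are made up for illustration, and the 4.5 to 5.0 bits-per-weight range for a 4-bit K-quant is my assumption, not a figure from this thread. Real GGUF file sizes vary, and the KV cache and runtime overhead come on top.

```python
def estimate_weights_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of quantized weights in GiB.

    Ignores KV cache, activations, and runtime overhead.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billions: float, bits_per_weight: float,
         vram_gib: float, headroom_gib: float = 0.0) -> bool:
    """True if the estimated weights plus the requested headroom fit in vram_gib."""
    needed = estimate_weights_gib(params_billions, bits_per_weight) + headroom_gib
    return needed <= vram_gib


if __name__ == "__main__":
    # Assumed bits-per-weight values for a 4-bit quant; adjust to your actual file.
    for bpw in (4.5, 5.0):
        size = estimate_weights_gib(35, bpw)
        print(f"35B at {bpw} bits/weight: ~{size:.1f} GiB of weights")
```

Pass a nonzero `headroom_gib` to leave room for context. How much you need depends on context length and the runtime, so treat the result as a first guess rather than a guarantee.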