Curious — how do you run opencode and Qwen locally? The few times I tried, it responded with nonsense. Chat through Ollama, say, works well.
reply
Which quants are you using? I had a similar issue until I switched to Unsloth's. I'd recommend at least UD_6. Also, make sure your context length is above 65K.

https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF
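For reference, here's a minimal sketch of serving a GGUF quant locally with llama.cpp's `llama-server` and an extended context window. The model filename is a placeholder for whichever Unsloth quant you download; flags assume a reasonably recent llama.cpp build:

```shell
# Minimal sketch, assuming a recent llama.cpp build.
# ./model.gguf is a placeholder -- substitute the Unsloth GGUF you downloaded.
# -c sets the context length (above 65K, as suggested above),
# -ngl offloads layers to the GPU if one is available.
llama-server -m ./model.gguf -c 65536 -ngl 99 --port 8080
```

Then point your client at the OpenAI-compatible endpoint on `localhost:8080`. With too small a `-c`, long agentic sessions silently truncate the prompt, which is one common source of nonsense replies.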

reply
Thanks, I appreciate the info. I may try to spin up something like this and give it a whirl.
reply
I would recommend trying oMLX, which is much more performant and efficient than LM Studio. It has block-level KV context caching that makes long chats and agentic/tool-calling scenarios MUCH faster.
reply