undefined

points

[-]

Curious how do you run opencode and qwen locally? Few times I tried it responds back with some nonsense. Chat, say, through ollama works well.

by syntaxing2 hours ago|

parent|

[-]

Which quants are you using? I had similar issue until I used Unsloth’s. I would recommend at least UD_6. Also, make sure your context length is above 65K.

https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF

by Someone123418 hours ago|

prev|

[-]

Thanks I appreciate the info. I may try to spin up something like this and give it a whirl.

by anon37383916 hours ago|

parent|

[-]

I would recommend trying oMLX, which is much more performant and efficient than LM Studio. It has block-level KV context caching that makes long chats and agentic/tool calling scenarios MUCH faster.