Every model I've tried running on my 1070 instantly crashes my old tower. Probably time to get off Windows and run Linux on it.
reply
It's understated how much of a boon AI development has been for Linux.

There isn’t any benefit to running a Windows machine.

reply
Au contraire, I run models on WSL and my desktop reliably wakes up from sleep. Best of both worlds.
reply
Watch out for VRAM/RAM. When programs on WSL tried to take more than was available, it crashed WSL hard and even corrupted files on the virtual drive attached to WSL, in a very strange way that made recent files just disappear. I had at least two projects that had to be rebuilt from past conversations with AI because all their files were gone after a few crashes.

I asked Codex to write a WSL config to prevent crashes. It put some limits on WSL and the situation stabilized, but I lost all trust in WSL anyway.
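
For anyone looking for a starting point: limits like these normally go in a `.wslconfig` file in the Windows user profile folder. The sketch below is not the config Codex generated. The keys are standard WSL 2 settings, but the values are assumptions and should be sized to the machine, leaving headroom for Windows itself.

```ini
# %UserProfile%\.wslconfig
# Illustrative values only (assumed, not the settings from this comment)
[wsl2]
# Cap the RAM the WSL 2 VM may take from Windows
memory=16GB
# Limit the number of logical processors given to the VM
processors=8
# Swap the VM can fall back on once it reaches the RAM cap
swap=8GB
```

Changes take effect after running `wsl --shutdown` and starting the distro again. These keys limit system RAM and CPU only. They do not cap VRAM, so the model and context size still have to fit the GPU.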

reply
What are the config settings you needed?

TIA

reply
Sounds like a hardware issue. NVIDIA driver issues can't be ruled out, but they're much rarer these days.
reply
Mind sharing your llama.cpp settings for that?
reply

  .\llama-server.exe -m ..\Qwen3.6-35B-A3B-UD-Q4_K_M.gguf -ngl 999 --n-cpu-moe 41 -c 262144 --port 8081 --flash-attn on --cache-type-k turbo4 --cache-type-v turbo3 --no-mmap --mlock --host 0.0.0.0 -t 8 -tb 8 -np 1
Using this llama.cpp fork https://github.com/TheTom/llama-cpp-turboquant and mostly copying from this video https://www.youtube.com/watch?v=8F_5pdcD3HY

Haven't had much time to test it other than asking a few questions & changing some HTML in Cline, so it might be thick as a brick for all I know, but still worth trying.

reply
I just tested it with some RISC-V code and it wrote a "mov" instruction several times, which doesn't exist in RISC-V (the pseudo-instruction is "mv"). Yeah, something probably needs tuning.
reply