What is everyone running DeepSeek v4 Flash with?!

It’s currently unsupported in llama.cpp, and vLLM doesn’t support GPU+CPU MoE, so unless all of you have an array of DGX Sparks in your bedroom, what’s the secret sauce?!

reply
https://www.github.com/antirez/ds4 (from Antirez of Redis fame) runs a 2-bit quant on Apple Silicon hardware with 96 GB or 128 GB of RAM.
reply
Just because you read it in a GitHub repo doesn't make it true. It also doesn't take into account CPU temps and the inevitable throttling you'll encounter.
reply
I ran it on my own device, haha.

I don't understand why people are in such disbelief at how much better this stuff runs on a Mac Studio than on NVIDIA hardware with 1/5th the VRAM. Look, what can I say? NVIDIA is a bigger rip-off than Apple is!

reply
Which is good, because Nvidia pulling a Micron and ceasing consumer hardware production is right around the corner.