I've been happy with an OEM Spark (128GB), enough so that I picked up a second one. I have 2x Qwen and 1x Gemma (both at 8-bit and full context), plus an embedding model, a re-ranker, and a 1.7B for little things. That's 6 models running, and I'm probably going to add STT soon since I want to try talking more than typing.
The caveat is that if you try to use multiple models on the same device at the same time, they thrash and it destroys tok/s.
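If you put your own router in front of the models, one way to dodge that is to let only one generation run per device at a time and queue the rest. Rough sketch below, not what I actually run: the model and device names in `MODEL_DEVICE` and the `DeviceGate` class are made up for illustration, and `job` stands in for whatever async call hits your inference server.

```python
import asyncio

# Hypothetical model -> device mapping; the names are illustrative only.
MODEL_DEVICE = {
    "qwen-a": "spark-1",
    "gemma": "spark-1",
    "qwen-b": "spark-2",
}


class DeviceGate:
    """Allow only one in-flight generation per device; others wait their turn."""

    def __init__(self, model_device):
        self.model_device = dict(model_device)
        self.locks = {}  # device name -> asyncio.Lock, created on first use

    async def run(self, model, job):
        """Await job() while holding the lock of the device that hosts `model`."""
        device = self.model_device[model]
        # Create the lock lazily so it is made inside the running event loop.
        lock = self.locks.setdefault(device, asyncio.Lock())
        async with lock:
            return await job()
```

Requests to models on different devices still run in parallel; only requests that land on the same device get serialized, so each one gets the full tok/s while it runs.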