I can’t figure out when it makes sense to pay 10k up front to run a quantized Llama 3.1, but it’s an interesting option.
But yeah, there's a bit of a dearth of models right now that can fully use memory in the 128-256GB bracket. Things move so fast in this space, though, that I wouldn't base my decision on a generation of models that's just a few months old.
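
For a rough sense of what lands in that bracket, here's a back-of-the-envelope sketch. It assumes weight-only footprint (parameters × bits per weight / 8) and ignores KV cache, activations, and quantization overhead, so real usage will be higher. The function names and the bit widths I try are just my own picks for illustration; the parameter counts are the published Llama 3.1 sizes.

```python
# Rough weight-only memory estimate; real usage is higher (KV cache, activations, overhead).

LLAMA_3_1_SIZES_B = [8, 70, 405]  # Llama 3.1 parameter counts, in billions


def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def sizes_in_bracket(bits_per_weight: float, low_gb: float = 128, high_gb: float = 256):
    """Return the model sizes whose weights alone fall inside [low_gb, high_gb]."""
    return [
        p for p in LLAMA_3_1_SIZES_B
        if low_gb <= weight_footprint_gb(p, bits_per_weight) <= high_gb
    ]


if __name__ == "__main__":
    for bits in (4, 8, 16):  # assumed quantization levels, for illustration only
        print(f"{bits}-bit: sizes in the 128-256GB bracket -> {sizes_in_bracket(bits)}")
```

By this estimate only a couple of size/precision combinations sit in the bracket at all, which is roughly the point about there not being much to fill that memory with yet.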