Are these unified memory Macs and giant 24GB desktop GPUs achieving dozens or hundreds of tokens per second commensurate with their 10x-20x cost?
The full 128GB is surely helpful in keeping browsers, editors and other things running since even 20-35GB models + k/v caches can eat up a lot of the core 64GB in my experience.