If you want to run on a Mac Pro or iMac, this will be fine, but at those price points you'd be silly to spend money on either when you can get dual Nvidia GPUs for the same amount of RAM, and that will be dedicated VRAM.
For portable Apple devices, the max memory you can get currently is 24GB IIRC, and that's probably not going to change any time soon. The only decent model that can run locally is Gemma 27B QAT, which will eat up 17GB at a minimum, and that model really struggles with things you can do for free on ChatGPT or Gemini.
So yeah, speed is not gonna matter when results are shit.
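For a rough sense of where that ~17GB figure comes from, here's a back-of-the-envelope sketch; the quantization and overhead numbers are assumptions for illustration, not measurements:

    # Rough memory estimate for running a quantized 27B model locally.
    # Bits-per-weight and overheads are assumed, ballpark values.

    params_b = 27                 # Gemma 27B: ~27 billion parameters
    bits_per_weight = 4.5         # QAT / ~4-bit quantization incl. overhead (assumed)
    kv_cache_gb = 2.0             # assumed: modest context window
    runtime_overhead_gb = 1.5     # assumed: framework buffers, activations, etc.

    # billions of params * bytes per param ~= GB of weights
    weights_gb = params_b * bits_per_weight / 8
    total_gb = weights_gb + kv_cache_gb + runtime_overhead_gb

    print(f"weights ~{weights_gb:.1f} GB, total ~{total_gb:.1f} GB")
    # -> weights ~15.2 GB, total ~18.7 GB, which leaves very little headroom on 24GB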
The Mac Studio with the Ultra chip has up to 512GB and, while expensive, has more than twice the memory of any GPU alternative at a similar price point.
What was holding it back was the lack of matmul acceleration, which it seems will change soon. Nvidia cards will likely still be faster and have better support, but you pay a very big premium for that (ironic that Apple is the cheaper option for once).
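A rough way to see where each side wins (ballpark specs, assumed for illustration): single-stream decoding is mostly memory-bandwidth bound, while prompt processing is compute bound, which is where matmul acceleration and Nvidia's tensor cores make the bigger difference.

    # Decode speed is roughly bandwidth-limited: each generated token streams
    # the whole resident model through memory. Bandwidth figures are ballpark.

    model_size_gb = 17  # e.g. Gemma 27B QAT resident weights, from the estimate above

    bandwidth_gb_s = {
        "M2 Ultra (unified memory)": 800,   # approx. spec
        "RTX 4090 (GDDR6X)": 1008,          # approx. spec
    }

    for name, bw in bandwidth_gb_s.items():
        # Upper bound on decode tokens/sec if purely bandwidth limited.
        print(f"{name}: ~{bw / model_size_gb:.0f} tok/s ceiling")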
It isn't cheap, but you can buy a 16-inch MacBook Pro with 128GB of unified memory today.
Useful /r/LocalLlama discussion: https://www.reddit.com/r/LocalLLaMA/comments/1ncprrq/apple_a...