Apple adds matmul acceleration to A19 Pro GPU, making local LLMs far more viable on future Macs

If you want to run on a Mac Pro or iMac, this will be fine, but at those price points you'd be silly to spend money on either when you can get dual Nvidias with the same amount of RAM, and that will be dedicated RAM.

For portable Apple devices, the max memory you can get currently is 24GB IIRC, and that's probably not going to change any time soon. The only decent model that can run locally is Gemma 27B QAT, which will eat up 17GB at minimum, and that model really struggles with things you can do for free on ChatGPT or Gemini.
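
Rough arithmetic behind that 17GB figure, as a back-of-envelope sketch: the bits-per-weight, layer count, KV-head count, and context length below are illustrative assumptions, not Gemma's actual configuration, but weights plus KV cache plus runtime overhead is the usual way these footprints are estimated.

    # Back-of-envelope memory estimate for a ~27B-parameter model quantized to ~4 bits.
    # All architecture numbers here are assumptions for illustration only.

    def weight_gib(params_b: float, bits_per_weight: float) -> float:
        """Memory for the weights alone, in GiB."""
        return params_b * 1e9 * bits_per_weight / 8 / 2**30

    def kv_cache_gib(layers: int, kv_heads: int, head_dim: int,
                     tokens: int, bytes_per_elem: int = 2) -> float:
        """KV cache in GiB: K and V tensors per layer, fp16 values assumed."""
        return 2 * layers * kv_heads * head_dim * tokens * bytes_per_elem / 2**30

    w = weight_gib(27, 4.5)   # ~4.5 bits/weight incl. quantization overhead (assumed)
    kv = kv_cache_gib(layers=60, kv_heads=16, head_dim=128, tokens=4096)
    print(f"weights ~{w:.1f} GiB + KV cache ~{kv:.1f} GiB + ~1 GiB runtime = ~{w + kv + 1:.1f} GiB")

With those assumed numbers you land right around 17GB for weights plus KV cache plus runtime overhead, which is why a 24GB machine can load the model but has little headroom left for the OS or longer contexts.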

So yeah, speed is not gonna matter when results are shit.

reply
Please don't use Ask HN as a blogging platform. Just submit a link to an informative article, or comment on an existing article.

https://news.ycombinator.com/newsfaq.html

reply
The only official acknowledgment I've seen so far was a brief mention on a slide during the presentation.

Useful /r/LocalLlama discussion: https://www.reddit.com/r/LocalLLaMA/comments/1ncprrq/apple_a...

reply
The first SoC to include a Neural Engine was the A11 Bionic, used in the iPhone 8, 8 Plus, and iPhone X, introduced in 2017. Since then, every Apple A-series SoC has included a Neural Engine.
reply