> I work 128GB M4 but not run inference, just too many electron apps running at the same time.
This is somewhat depressing - needing a couple of thousand bucks worth of ram just to run your chat app and code/text editor and API doco tool and forum app and notetaking app all at the same time...
Inference will be dirt cheap for things like coding but you'll want much more compute for architectural planning, personal assistants with persistent real time "thinking / memory", as well as real time multimedia. I could put 10 M4s to work right now and it won't be enough for what I've been cooking.