Hell, an 8GB hard drive was unfathomable when I was a kid in the 90s. I remember getting a 30 megabyte drive for our Mac LC.
My childhood best friend and neighbour had the same kind of computer, except theirs only had something like 384K of memory, and I tried to convince them their computer was broken when it didn't count up all the way.
Mine had 4K of RAM and an 800 kHz CPU - and I was living in the future, man. No way I could use that much memory. After all, I had to type in whatever program I wanted to run every time I turned it on. Then I got a manual audio cassette recorder and thought "Whoa, I don't think it gets better than this!"
The same world where a classic Mac came with 128KB of RAM.
We might someday live in a world where entry-level is 256TB of RAM.
If Apple could go back in time 3.5 years and decide to build their own factory, that would put them in a great position today. But deciding to do it now won't increase their supply 3.5 years from now more than just increasing their long-term orders with existing suppliers. Those suppliers will start building new factories based on Apple's increased orders and they'll do it faster and cheaper than Apple can because they don't have to build some factories in the U.S. for political reasons or worry as much about environmental regulation, permitting and ensuring Apple employees in Penang get benefits similar to employees in Cupertino.
You're talking about the "best" things Apple could do with their money, in terms of investment returns, but I think that misses the point that Apple literally can't buy enough memory at any price.
One of the reasons Intel fell behind is that, for business reasons, they couldn't open their fabs to competitors, and therefore could never scale as high as TSMC could.
There are many other reasons, but accounting is a huge one. Unless there is a huge ROI or something else we don't otherwise know, I don't see Apple adding depreciating assets as expensive as chip fabs to their books.
Apple’s not exactly famous for low pricing on spec upgrades, or for competing as the price leader…
Personally, I think there's no way memory-heavy inference moves on-device (vs cloud) due to the economics, but it's not impossible that technology + platforms go that way for currently unforeseeable reasons.
My non-tech friends and family would probably be served perfectly fine by local models today, if they had a working web search tool. Their queries are often “soft” and don’t have an exact answer. My mom and aunt used it to pick a hairstyle, my mom used it to get an image of what a room would look like with particular drapes in it, etc. Stuff I think mid-sized local models like Gemma or smaller Qwens could do without issue. They just don’t have a device that will run them.
Businesses won’t move. They need a huge context so they can stuff a bunch of Confluence pages and 300 tools in it, and it needs to read an entire codebase and yada yada. The hardware depreciation and electricity will probably make running it locally a wash, or even cost more than paying for API access.
But maybe that hardware becomes so commoditized that it's not difficult to obtain / stuff in a box.
In that world, a) we are already at or close to having enough memory in local devices to do inference locally, and b) that memory isn't inference-specific and can be utilized for other things. Most devices come with enough memory to do some level of inference, and some come with plenty (eg a gaming desktop probably has 32GB+ of RAM in it).
You aren't going to run Kimi on it, but I think the reality for a lot of consumer inference is that it doesn't need to. It's going to be a lot of things that are soft and easily answered by a search API, so the LLM really just needs to be able to skim and summarize. Going a step further, we may even see some kind of hybrid approach where a local OpenRouter kind of thing decides whether the task is soft enough to do locally with models that fit in RAM or if it needs to be farmed out to a PaaS provider.
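To make that hybrid idea a bit more concrete, here's a rough sketch of what the routing decision might look like. Everything in it is made up for illustration: `LocalModel`, `fits_in_ram`, and `route` are hypothetical names, the 1.2x overhead factor for KV cache and runtime is a guess, and "is this task soft?" is hand-waved as a boolean that something else would have to decide.

```python
from dataclasses import dataclass


@dataclass
class LocalModel:
    """Hypothetical description of a model that could run on-device."""
    name: str
    params_billions: float
    bits_per_weight: int = 4  # assumes a 4-bit quantized build

    def weight_gb(self) -> float:
        # 1e9 parameters at 8 bits each is roughly 1 GB of weights.
        return self.params_billions * self.bits_per_weight / 8


def fits_in_ram(model: LocalModel, free_ram_gb: float, overhead: float = 1.2) -> bool:
    # overhead is an assumed fudge factor for KV cache and runtime buffers.
    return model.weight_gb() * overhead <= free_ram_gb


def route(task_is_soft: bool, model: LocalModel, free_ram_gb: float) -> str:
    # Run locally only if the task is "soft" AND the model fits in memory;
    # otherwise farm it out to a hosted provider.
    if task_is_soft and fits_in_ram(model, free_ram_gb):
        return "local"
    return "remote"


small = LocalModel("small-8b", params_billions=8)
huge = LocalModel("huge-1000b", params_billions=1000)

print(route(True, small, free_ram_gb=32))
print(route(True, huge, free_ram_gb=32))
print(route(False, small, free_ram_gb=32))
```

The interesting part in practice would be the piece this skips: deciding what counts as "soft" without already needing a big model to make that call.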