Intel should have shipped their GPUs with much more VRAM from day one. If they had done this, they'd have carved out a massive niche and much more market share, and it would have been trivially simple to do.
AMD should have improved their tools and software, etc.
Apple should have done as you say.
Google had nigh on a decade to boost TPU production, and they're still somehow behind the curve.
Such a lack of vision. And thus Nvidia is, now quite durably, the most valuable company in the world. Imagine telling that to a time traveler from 2018.
And as of now I do believe AMD is in the second strongest position in the datacenter space after Nvidia, ahead of even Google.
For money, probably.
Apple is presumably leaving a lot of money on the table by not trying to sell Apple Silicon for AI inference and training. They're the only ones who can attach reasonably large GPUs (M3 Ultra) to very large amounts of cheaper memory (512GB of unified memory per GPU). Apple could e.g. sell server SKUs of Mac Studios, heck they could sell M3 Ultra chips on PCIe cards. And they could further develop Apple Silicon in that direction. Presumably they would be seen as a very legit competitor to Nvidia that way, perhaps more so than Intel and AMD. I'd assume that in the current climate this would be extremely lucrative.
Now, actually doing this would disrupt Apple's own supply chain and force it to commit significant internal resources and cultural change to this kind of product line. There's a good argument to be made that it would disproportionately hurt its Mac business, so this would be a very risky move.
But given that AI hardware is likely much higher margin than the Mac business, an argument could probably (sadly) be made that it'd be lucrative for them to try it. I personally don't think Apple is inclined to take this kind of risk and jeopardize the Mac, but I'm sure some people at Apple have considered this.
From inside news: They were not breaking even on their existing GPUs. The strategy was to take a loss just to have a presence in the space.
Nvidia is the most valuable company in the world right up until the AI bubble pops. Which, while it's hard to nail down when, is going to happen. I wouldn't call their position durable at all.
For all the faults of them leaning in hard on these things for stock market and personal gains, Nvidia still has some of the best quality products around. That is their saving grace.
They will not be the world's most valuable company once the bubble pops, and will probably never get back there again, but they will continue to be a decent enough business. I just want them to go back to talking about graphics more than AI again, that will be nice.
As handwriting code is rapidly going out of fashion this year, it seems likely AI is coming for most of knowledge work next.
And who is to say that manual labor is safe for long?
They want to be able to sell handsets, desktops and laptops to their customer base.
Pursuing a product line that would divert the finite amount of silicon manufacturing resources away from that user base would be corporate suicide.
Even Nvidia has all but dropped support for its traditional gaming customer base to satisfy its new strategy.
At any rate, the local inference capabilities are only going to get cheaper and more accessible over the coming years, and Apple are probably better placed than anyone to make it happen.
But for some reason Apple thought the sound recording engineer or the video editor market was more important... like, WTF dude? Have some vision at least!
Sound recording engineers and video editors will not disappear after the AI bubble bursts, and Apple is wise to keep that market. Bursting the AI bubble will not make AI disappear, it will just end the crazy cashflows we are seeing now. And in that regard, with the capabilities of their hardware, Apple is in a pretty good spot I think.
Even if Apple had an amazing GPU for AI it wouldn’t matter hugely. Local inference hasn’t taken off yet, and cloud inference and training all use servers, where Apple has no market share and wasn’t going to get any, since people had already built their stacks around CUDA before Apple could even have woken up to that.
$1T backlog in orders over the next 2 years.
Remember when a $1 billion valuation used to be a big thing? That is nothing compared with nowadays.
Obviously the Siri from WWDC 2 years ago was a disaster for Apple. Other than that they seem to have done pretty well navigating the new LLM world. I do think they would benefit from having their own SOTA LLM, but I don’t think it is necessary for them. My mental model for LLMs and Apple is that they are similar to GarageBand: “now everyone can play an instrument” becomes “now anyone can make an app”. Apple owns the interface to the user (I don’t see anyone making nicer-to-use consumer hardware) and can use whatever stack in the background to deliver the technical features they decide to.
Inference has never been an issue for M series, and MLX just ramped it up further.
You can do training on the latest MBPs, although for any serious models you are going to the cloud anyway.
What are they wasting, exactly?
Apple is counting on something else: model shrink. Everyone is now looking at "how do we make these smaller".
At some point a beefy Mac Studio and the "right sized" model is going to be what people want. Apple dumped a 4 pack of them in the hands of a lot of tech influencers a few months back and they were fairly interesting (expensive tho).
The most powerful AI interactions I've had involved giving a model a task and then fucking off. At that point, I don't actually care if it takes 5 minutes or an hour. I've queued up a list of background tasks it can work on, and that I can circle back to when I have time. In that context, smaller isn't even the virtue at hand; user patience is. Having a machine that works on my bullshit questions and modelling projects at one tenth the speed of a datacentre could still work out to being a good deal even before considering the privacy and lock-in problems.
Claude and Kagi Assistant. I tried tooling up a multi-model environment in Ollama and it was annoying. It's just searching the web, building models and then running a test suite against the model to refine it.
It's pretty clear that this isn't going to happen any time soon, if ever. You can't shrink the models without destroying their coherence, and this is a consistently robust observation across the board.
Smaller models have gotten much more powerful over the last 2 years. Qwen 3.5 is one example of this. The cost/compute requirements of running the same level of intelligence are going down.
The inputs are parsed with a large LLM. This gets passed on to a smaller hyper specific model. That outputs to a large LLM to make it readable.
Essentially you can blend two model types. Probabilistic Input > Deterministic function > Probabilistic Output. Have multiple little deterministic models that are chosen for specific tasks. Now all of this is VERY easy to say, and VERY difficult to do.
But if it could be done, it would basically shrink all the models needed. You don't need a huge input/output model if it is more of an interpreter.
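To make the "Probabilistic Input > Deterministic function > Probabilistic Output" idea concrete, here is a toy sketch. Everything in it is hypothetical: `parse_request` and `render_answer` are plain-Python stand-ins for where the large LLMs would sit, and `SPECIALISTS` is a made-up registry of small task-specific functions. It only shows the shape of the routing, not how anyone actually builds this.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for the large "input" LLM: turns free text into a
# structured request. A real system would call a model here.
def parse_request(text: str) -> dict:
    words = text.lower().split()
    numbers = [float(w) for w in words if w.replace(".", "", 1).isdigit()]
    task = "sum" if ("sum" in words or "add" in words) else "max"
    return {"task": task, "args": numbers}

# Hypothetical registry of small, deterministic, task-specific "models".
SPECIALISTS: Dict[str, Callable[[List[float]], float]] = {
    "sum": sum,
    "max": max,
}

def run_specialist(request: dict) -> float:
    # Deterministic middle step: pick the specialist by task name and run it.
    return SPECIALISTS[request["task"]](request["args"])

# Hypothetical stand-in for the large "output" LLM: makes the result readable.
def render_answer(request: dict, result: float) -> str:
    return f"The {request['task']} of {request['args']} is {result}."

def pipeline(text: str) -> str:
    request = parse_request(text)
    result = run_specialist(request)
    return render_answer(request, result)

if __name__ == "__main__":
    print(pipeline("please add 2 and 3.5"))
```

The point of the split is that only the two ends need to cope with messy language; the middle can be small and exact. The hard part, as said above, is getting the first stage to produce a reliable structured request.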
Give every iPhone family an in-house Siri that will deal with canceling services and pursuing refunds.
Your customer screw-up results in your site getting an agent-driven DDoS on its CS department till you give in.
Siri: "Hey User, here's your daily update, I see you haven't been to the gym, would you like me to harass their customer service department till they let you out of their onerous contract?"
I don’t need the latest and greatest, and I fine-tuned LM Studio enough that I get acceptable results in 30 to 90 seconds that help me keep moving ahead. I am not a software engineer, and I am definitely not as much of a “coder” as the average person on HN. So if I can do it for less than $2000, I bet a lot of (smarter/more experienced at coding) people could see great results for $5000.
You can get an M3 Ultra Mac Studio with 96GB RAM for $4000. If you’re willing to go up to $6k it’s 256GB. Wayyyyy more firepower than my setup. I imagine plenty powerful for a lot of people.
For multi-GPU you can network multiple Macs at high speed now. Their biggest disadvantage to Nvidia right now is that no one wants to do kernel authoring in Metal. AMD learned that the hard way when they gave up on OpenCL and built HIP.
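For a rough feel of what those RAM sizes buy you, here is a back-of-the-envelope sketch. It only counts the model weights (parameters times bits per weight), ignores KV cache and everything else running on the machine, and the `usable_fraction` of 0.75 is purely my assumption about how much of the RAM you can realistically hand to the model, not a published figure.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8

def fits(params_billion: float, bits_per_weight: int, ram_gb: float,
         usable_fraction: float = 0.75) -> bool:
    """True if the weights fit in the assumed usable share of RAM.

    usable_fraction is an assumption for illustration only.
    """
    return weight_memory_gb(params_billion, bits_per_weight) <= ram_gb * usable_fraction

if __name__ == "__main__":
    for ram in (96, 256):
        for bits in (4, 16):
            print(f"70B at {bits}-bit on {ram}GB: fits={fits(70, bits, ram)}")
```

So a 70B-parameter model quantized to 4 bits is about 35GB of weights, while the same model at 16 bits is about 140GB, which is where the jump from 96GB to 256GB starts to matter.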