Just about everybody who isn't Nvidia dropped the ball, bigtime.

Intel should have shipped their GPUs with much more VRAM from day one. If they had done this, they'd have carved out a massive niche and much more market share, and it would have been trivially simple to do.

AMD should have improved their tools and software, etc.

Apple should have done as you say.

Google had nigh on a decade to boost TPU production, and they're still somehow behind the curve.

Such a lack of vision. And thus Nvidia is, now quite durably, the most valuable company in the world. Imagine telling that to a time traveler from 2018.

reply
I think for AMD, they were focused on competing against Intel. Remember AMD was almost bankrupt about 15 years ago because of competing against Intel. But the very first GPU use for AI was actually with an ATI/AMD GPU, not an Nvidia one. Everyone thinks Nvidia kicked off the GPU AI craze when Ilya Sutskever cleaned up on AlexNet with an Nvidia GPU back in 2012, or when Andrew Ng and team at Stanford published their "Large Scale Deep Unsupervised Learning using Graphics Processors" in 2009. But in 2004, a couple of Korean researchers were the first to implement neural networks on a GPU, using ATI Radeons: https://www.sciencedirect.com/science/article/abs/pii/S00313...

And as of now I do believe AMD is in the second strongest position in the datacenter space after Nvidia, ahead of even Google.

reply
Why should Apple have done this? It doesn't fit their business in any way, shape or form. Where does data centre hardware sit relative to the electronics/humanities crossroads that is foundational for Apple?
reply
> Why should Apple have done this?

For money, probably.

Apple is presumably leaving a lot of money on the table by not trying to sell Apple Silicon for AI inference and training. They're the only ones who can attach reasonably large GPUs (M3 Ultra) to very large amounts of cheaper memory (512GB of unified memory per GPU). Apple could e.g. sell server SKUs of Mac Studios; heck, they could sell M3 Ultra chips on PCIe cards. And they could develop Apple Silicon further in that direction. Presumably they would be seen as a very legit competitor to Nvidia that way, perhaps more so than Intel and AMD. I'd assume that in the current climate this would be extremely lucrative.

Now, actually doing this would disrupt Apple's own supply chain as well as force it to spend significant internal resources and undergo cultural change for this kind of product line. There's a good argument to be made that it would disproportionately hurt its Mac business, so this would be a very risky move.

But given that AI hardware likely carries much higher margins than the Mac business, an argument could probably (sadly) be made that it'd be lucrative for them to try it. I personally don't think Apple is inclined to take this kind of risk and jeopardize the Mac, but I'm sure some people at Apple have considered it.

reply
I guess I mean that for Apple to remain Apple, they would not do this; it runs against their company culture.
reply
Yeah, nothing about Apple is server-side, and IMHO that's what training is. To be serious about it as a company you need all sorts of other tools (crawlers, etc.) helping with training, so it basically has to live in the datacenter at any reasonable scale anyway. And that's just not where Apple lives. We saw with Swift that they couldn't focus on the server side enough to make it a serious language there, and they've consistently declined to enter that area over the years because it's outside their wheelhouse.
reply
Trust me: If Intel could, it would.

Word from inside: they were not breaking even on their existing GPUs. The strategy was to take a loss just to have a presence in the space.

reply
Intel could position their cards as strong for certain workloads. They were first to market with AV1 support, for example.
reply
Intel doesn't limit how much memory card makers can pair with their GPU. It's up to the card maker.
reply
> And thus Nvidia is, now quite durably, the most valuable company in the world.

Nvidia is the most valuable company in the world right up until the AI bubble pops. Which, while it's hard to nail down when, is going to happen. I wouldn't call their position durable at all.

reply
The crashing and burning of Nvidia stock has been predicted for a while now and keeps not really happening. It's gone pretty flat and volatile up there around $180, but they keep delivering the results to back it up. I was thinking this week that Apple is really primed to make a killing in the next couple of years from people who want to run their LLM on-device coupled with an agent. We're a long way off being able to train the models locally – that is going to need an Nvidia-powered datacentre for the foreseeable future – but local inference seems absolutely like a market Apple could capture, gutting the most premium revenue from Anthropic and OpenAI by selling Macs with a large amount of integrated memory to anyone who'd rather give them the money to run their native OpenClaw/agent than pay ever-growing monthly bills for tokens.
reply
It may well be the case that they fall a long way, but Nvidia will not fail as a whole. They have a way of maximizing their position relentlessly. CUDA keeps putting them in amazing positions on things like image recognition, AR, crypto and now AI.

For all their faults in leaning hard into these things for stock-market and personal gain, Nvidia still makes some of the best-quality products around. That is their saving grace.

They will not be the world's most valuable company once the bubble pops, and will probably never get back there again, but they will continue to be a decent enough business. I just want them to go back to talking about graphics more than AI again; that would be nice.

reply
I might as well say that no, it is not going to happen.

As writing code by hand rapidly goes out of fashion this year, it seems likely AI is coming for most knowledge work next.

And who is to say that manual labor is safe for long?

reply
Apple makes AI inference and training servers by the thousands. They just don't sell them to anyone. They use them internally in their datacenters. They didn't drop the ball, they are playing a different game while not cannibalizing their existing customer base.
reply
They didn’t drop the ball at all?

They want to be able to sell handsets, desktops and laptops to their customer base.

Pursuing a product line that would divert finite silicon manufacturing capacity away from that user base would be corporate suicide.

Even Nvidia has all but dropped support for its traditional gaming customer base to satisfy its new strategy.

At any rate, the local inference capabilities are only going to get cheaper and more accessible over the coming years, and Apple are probably better placed than anyone to make it happen.

reply
Don’t mistake stock market performance for revenue. NVIDIA makes ~200B annually, same as what Apple makes from iPhones. It’s a big market but GPUs aren’t just AI.
reply
I'm purely talking in terms of revenue. There's a huge demand for AI systems from personal workstations to datacenter servers, and Apple was one of the few companies in the world in a position to build complete systems for it.

But for some reason Apple thought the sound recording engineer or the video editor market was more important... like, WTF dude? Have some vision at least!

reply
Some people at Apple see it. That's why they added matmul to the M5 GPU and keep mentioning LM Studio in their marketing.
reply
Their rule of only releasing major software updates once a year in June is holding them back IMO. Their local LLM APIs were dated before macOS/iOS 26 was even released. Just because something worked 20 years ago doesn't mean it works today, but I'm sure it's hard to argue against a historically successful strategy internally.
reply
Huh? What local LLM APIs? It uses Metal.
reply
Apple abandoned the pro video editor market many years ago with the trash can Mac Pro - they're "prosumer" only at best.
reply
Apple already seems to do pretty well when it comes to AI systems on personal computers. Datacenters simply aren't their business; it would need some major changes on their part. Also, AI is a bubble, it will burst eventually, and because Apple doesn't have the first mover advantage Nvidia has, they have a lot to lose entering this market now.

Sound recording engineers and video editors will not disappear after the AI bubble bursts, and Apple is wise to keep that market. Bursting the AI bubble will not make AI disappear, it will just end the crazy cashflows we are seeing now. And in that regard, with the capabilities of their hardware, Apple is in a pretty good spot I think.

reply
It is more important. Both for the customer base that actually buys Apple machines and for the cachet and mindshare of being used by the people who create American culture.

Even if Apple had an amazing GPU for AI it wouldn't matter hugely - local inference hasn't taken off yet, and cloud inference and training all use servers, where Apple has no market share and wasn't going to get any, since people had already built their stacks around CUDA before Apple had even woken up to it.

reply
$280B and growing 70% YoY.

A $1T backlog of orders over the next 2 years.

reply
Those backlog orders are wild! One does wonder, though: if the bubble collapses or more global upsets happen in that time, how many of those will ever be fulfilled? Reality might not be so impressive, but even if it fell 80%, that is still $200B in revenue, and that is huge.

Remember when a $1 billion valuation used to be a big thing? That is nothing compared with nowadays.

reply
Just look at H100 cloud rental prices. Demand is increasing.
reply
Nah, Apple made the right choice. Nobody except a niche market of hobbyists is interested in running tiny quantized models.
reply
About the same niche market as the people who bought the Apple I, and we know where that went.
reply
The Apple I was a pretty poor predictor of what mainstream mass-market computing was going to end up looking like. I don't think anybody has yet come up with the Apple II of local LLMs, let alone the VisiCalc or Windows 95.
reply
If my Grandma had wheels she would be a bicycle. Apple would need to transition from being a consumer electronics company to being a B2B retailer of data centre hardware to take advantage of this.

Obviously Siri from WWDC two years ago was a disaster for Apple. Other than that they seem to have done pretty well navigating the new LLM world. I do think they would benefit from having their own SOTA LLM, but I don't think it is necessary for them. My mental model for LLMs and Apple is that they are similar to GarageBand - "now everyone can play an instrument" becomes "now anyone can make an app". Apple owns the interface to the user (I don't see anyone making nicer-to-use consumer hardware) and can use whatever stack in the background to deliver the technical features they decide to.

reply
If Apple doesn't offer a Linux product, they cannot be taken seriously for headless computing tasks. They are adamant about controlling the whole stack, so unless they make some server version of macOS (and wait years for the community to get accustomed to it), they will keep being a consumer/professional-oriented company.
reply
Apple didn't drop the ball - they have no interest in building servers for a limited-time bubble. It is laughable that anyone would think that market would be bigger than the iPhone - there's a reason no RAM manufacturer is building new plants to take advantage of the current demand: they don't expect it to last long enough to pay for their investment.
reply
> AI training as well as inference

Inference has never been an issue for M series, and MLX just ramped it up further.

You can do training on the latest MBPs, although for any serious models you're going to the cloud anyway.
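On the inference point, here's a minimal sketch of what running a local model with MLX looks like – this assumes the mlx_lm package roughly as its README shows, and the model name is just one example of a quantized model from the mlx-community hub:

    # Sketch: on-device inference with mlx_lm (API assumed per its README).
    from mlx_lm import load, generate

    # Any quantized model from the mlx-community hub should work; this one is just an example.
    model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")

    # Runs entirely on the Mac's GPU and unified memory.
    text = generate(model, tokenizer, prompt="Explain unified memory in one paragraph.", verbose=True)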

reply
> They had the infrastructure and custom SoCs and everything. What a waste.

What are they wasting, exactly?

reply
This is what needs to come back, with modern hardware and a modern interconnect:

https://en.wikipedia.org/wiki/Xserve

reply
> something competitive with Nvidia for AI training

Apple is counting on something else: model shrink. Everyone is now looking at "how do we make these smaller".

At some point a beefy Mac Studio and the "right sized" model is going to be what people want. Apple dumped a 4 pack of them in the hands of a lot of tech influencers a few months back and they were fairly interesting (expensive tho).

reply
> Apple is counting on something else: model shrink

The most powerful AI interactions I've had involved giving a model a task and then fucking off. At that point, I don't actually care if it takes 5 minutes or an hour. I've queued up a list of background tasks it can work on, and that I can circle back to when I have time. In that context, smaller isn't even the virtue at hand – user patience is. Having a machine that works on my bullshit questions and modelling projects at one tenth the speed of a datacentre could still work out to be a good deal, even before considering the privacy and lock-in problems.

reply
What "tooling" do you use to let AIs work unattended for long periods?
reply
> What "tooling" do you use to let AIs work unattended for long periods?

Claude and Kagi Assistant. I tried tooling up a multi-model environment in Ollama and it was annoying. It's just searching the web, building models and then running a test suite against the model to refine it.

reply
Cool? And it has nothing to do with what kind of consumer hardware Apple should sell. If your use cases are literally "bigger model better" then you should always use the cloud. No matter how much computing power Apple squeezes into their device, it won't be a mighty data center.
reply
For running the model once it's been trained, all a datacenter does is give you lower latency. Once devices have enough memory to host the model locally, the need to pay datacenter bills is going to be questioned. I'd rather run OpenClaw on my device plugged into a local LLM than rely on OpenAI or Claude.
reply
deleted
reply
> At some point a beefy Mac Studio and the "right sized" model is going to be what people want.

It's pretty clear that this isn't going to happen any time soon, if ever. You can't shrink the models without destroying their coherence, and this is a consistently robust observation across the board.

reply
I don't think it's about literally shrinking the models via quantization, but rather training smaller/more efficient models from scratch.

Smaller models have gotten much more powerful over the last 2 years; Qwen 3.5 is one example of this. The cost/compute required to run the same level of intelligence keeps going down.

reply
There are no practically useful small models, including Qwen 3.5. Yes, the small models of today are a lot more interesting than the small models of 2 years ago, but they remain broadly incoherent beyond demos and tinkering.
reply
I have said for a while that we need a sort of big-little-big model situation.

The inputs are parsed with a large LLM. This gets passed on to a smaller, hyper-specific model. That outputs to a large LLM to make it readable.

Essentially you can blend two model types: probabilistic input > deterministic function > probabilistic output. Have multiple little deterministic models that are chosen for specific tasks. Now all of this is VERY easy to say, and VERY difficult to do.

But if it could be done, it would basically shrink all the models needed. You don't need a huge input/output model if it is more of an interpreter.
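To make the shape concrete, here's a toy sketch in Python – the function names are made up and the two "large LLM" stages are stubbed out, it just shows the probabilistic > deterministic > probabilistic plumbing:

    # Toy sketch of the big-little-big idea; the large-LLM stages are stubs, not real model calls.
    def parse_with_large_llm(user_input: str) -> dict:
        # In reality: a large model prompted to emit a structured request (e.g. JSON).
        return {"task": "unit_conversion", "value": 5.0, "from": "miles", "to": "km"}

    def deterministic_middle(request: dict) -> dict:
        # Small, task-specific, fully deterministic stage chosen per task.
        handlers = {"unit_conversion": lambda r: {"result": r["value"] * 1.60934}}
        return handlers[request["task"]](request)

    def render_with_large_llm(request: dict, result: dict) -> str:
        # In reality: a large model turns the structured result back into readable prose.
        return f"{request['value']} {request['from']} is about {result['result']:.2f} {request['to']}."

    req = parse_with_large_llm("how far is 5 miles in kilometres?")
    print(render_with_large_llm(req, deterministic_middle(req)))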

reply
Yes, but bigger models are still more capable. Models shrinking (iso-performance) just means that people will train and use more capable models with a longer context.
reply
Of course they are! Both are important and will be around and used for different reasons.
reply
Cheaper than what you’d expect though. You could get a nice setup for $20-40k 6mo ago. As far as enterprise investments go, that’s a rounding error.
reply
Not all enterprises are the same. I imagine many companies have departments working toward local optima, so someone who could benefit from it for more productivity might not have access to it, because the department doing hardware acquisition is being measured in isolation.
reply
I think it’s a little unnecessary to lecture somebody on HN about how enterprises come in different shapes and sizes. It’s pretty clear what I’m implying here if you aren’t actively trying to assume the most reduced, least charitable version of my statement.
reply
Drop that down to $5k, and make it useful.

Give every iPhone family an in-house Siri that will deal with canceling services and pursuing refunds.

Your customer screw-up results in your site getting an agent-driven DDoS on its CS department till you give in.

Siri: "Hey User, here's your daily update, I see you haven't been to the gym, would you like me to harass their customer service department till they let you out of their onerous contract?"

reply
I'm running a modest setup using a Mistral model (24B) on a 9070 (AMD) and 32GB of RAM. $1800 machine at the time I built it. It ultimately boils down to what you want to do with it. For me, it's basically a drafting tool. I use it to break through writer's block, iterate, or just throw out some ideas. Sometimes summarize, but that can be hit or miss.

I don't need the latest and greatest, and I fine-tuned LM Studio enough that I get acceptable results in 30 to 90 seconds that help me keep moving ahead (rough sketch of the setup below). I am not a software engineer, and I'm definitely not as much of a "coder" as the average person on HN. So if I can do it for less than $2000, I bet a lot of people (smarter / more experienced at coding) could see great results for $5000.

You can get an M3 Ultra Mac Studio with 96GB of RAM for $4000. If you're willing to go up to $6k it's 256GB. Wayyyyy more firepower than my setup. I imagine plenty powerful for a lot of people.
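For what it's worth, the drafting workflow is mostly just hitting the local server LM Studio exposes. A rough sketch, assuming its default OpenAI-compatible endpoint on localhost:1234 and a placeholder model id – swap in whatever model you actually have loaded:

    # Sketch: querying a locally loaded model through LM Studio's OpenAI-compatible server.
    # Assumes the default endpoint (http://localhost:1234); the model id is a placeholder.
    import requests

    resp = requests.post(
        "http://localhost:1234/v1/chat/completions",
        json={
            "model": "mistral-small-24b",  # placeholder; use the id LM Studio shows for your model
            "messages": [
                {"role": "system", "content": "You are a drafting assistant."},
                {"role": "user", "content": "Give me three rough angles for a post about local LLMs."},
            ],
            "temperature": 0.7,
        },
        timeout=120,
    )
    print(resp.json()["choices"][0]["message"]["content"])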

reply
How is this dropping the ball? I think they dropped the ball a long time ago by waiting until the M5 to add integrated tensor cores, instead of relying only on the separate ANE as before.

For multi-GPU you can network multiple Macs at high speed now. Their biggest disadvantage versus Nvidia right now is that no one wants to do kernel authoring in Metal. AMD learned that the hard way when they gave up on OpenCL and built HIP.

reply
Nothing is a bigger market than the iPhone, let alone expensive niche machines.
reply