If open-source models are ~3-6 months behind SOTA, and roughly Opus 4.6-level capabilities are good enough for product-market fit, do the frontier labs have half a decade to make up for their prior burn?
AI costs ballooning faster than companies can afford them is becoming a very common topic in my circles right now. The era of "I'll pay infinitely more for marginal gains" is over, from what I can tell.
That's doing a lot of work here.
The future I see isn't most companies buying hundreds of thousands of dollars of hardware to run models; it's them adding a line item to their AWS bill. Inference costs on the larger hosted open-source models are dramatically lower than the frontier labs' API pricing.
Just think how much further that $100K would have gone if the hardware market weren't so screwed up.
Anecdote: I priced out adding 1TB of RAM to a four-node cluster a couple of months ago. The cluster was purchased in fall of 2024 with 4 nodes, each with 256GB of RAM. The nodes cost just over $14K apiece back in 2024 (entire box, not just the RAM).
Dell wanted >$90K a couple months ago to add 256GB to each node.
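To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It uses only the approximate figures quoted above ("just over $14K" per node and ">$90K" for the upgrade), so the results are ballpark lower bounds, and the variable names are just mine for illustration.

```python
# Rough figures from the anecdote above; both prices are approximate lower bounds.
NODES = 4
NODE_PRICE_2024 = 14_000     # "just over $14K" per complete node, fall 2024
UPGRADE_QUOTE = 90_000       # ">$90K" quoted to add 256GB to each node
ADDED_GB_PER_NODE = 256

cluster_price_2024 = NODES * NODE_PRICE_2024        # whole cluster, all hardware
added_gb_total = NODES * ADDED_GB_PER_NODE          # 1TB across the cluster
quote_per_gb = UPGRADE_QUOTE / added_gb_total       # dollars per added GB
quote_vs_cluster = UPGRADE_QUOTE / cluster_price_2024

print(f"Original cluster: ~${cluster_price_2024:,}")
print(f"RAM added: {added_gb_total} GB")
print(f"Quote per added GB: ~${quote_per_gb:.0f}")
print(f"Quote vs. original cluster price: ~{quote_vs_cluster:.1f}x")
```

In other words, the RAM-only upgrade was quoted at roughly 1.6x what the entire cluster cost new, or somewhere around $88 per added GB.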
What makes you so confident about this prediction? Hardware costs haven't exactly been cratering recently.
The decade-long move to all-cloud-all-the-time killed off in-house hardware teams while the C-suite chased their OpEx dreams.
It would be interesting if we come full circle on this.
I was going to say that the models are just going to keep growing faster than hardware pricing and availability can keep up.
But then I realised that, far more likely, we'll hit a plateau (again) where nobody is seeing gains, and at that point hardware will catch up.