Yeah, that's the part that just seems to be wildly under-discussed to me.

If open source models are ~3-6 months behind SOTA, and ~Opus 4.6 capabilities are good enough for product-market fit, do the frontier labs have half a decade to catch up on their prior burn?

AI cost ballooning faster than companies can afford is becoming a very common topic in my circles right now. The era of "I'll pay infinitely more for marginal gains" is over from what I can tell.

reply
Open source models that you can run locally are much more than 3 to 6 months behind. Six months ago was the November inflection point for Claude. No open source model is as good as Claude Opus 4.6.
reply
> that you can run locally

That's doing a lot of work here.

The future I see isn't most companies buying hundreds of thousands of dollars in hardware to run models, it's them adding a line item to their AWS bill. Inference costs on the larger hosted open source models are dramatically lower than the frontier labs' API pricing.

reply
> ...we are already looking at dropping $100k on hardware to run local models...

Just think how much further that $100K would have gone if the hardware market wasn't so screwed up.

Anecdote: I priced out adding 1TB of RAM to a four-node cluster a couple months ago. The cluster was purchased in fall of 2024 w/ 4 nodes, each with 256GB RAM. The nodes cost just over $14K apiece back in 2024 (entire box, not just the RAM).

Dell wanted >$90K a couple months ago to add 256GB to each node.
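
For scale, here is a rough back-of-the-envelope version of that comparison. It is only a sketch: it assumes the >$90K quote covers all four nodes (1TB total) rather than each node, and it treats "just over $14K" and ">$90K" as $14,000 and $90,000, so the results are approximate. The variable names are made up for illustration.

```python
# Figures taken from the anecdote above, rounded down ("just over", ">").
# Assumption: the >$90K quote is for the whole 4-node upgrade, not per node.
NODES = 4
NODE_PRICE_2024_USD = 14_000     # entire box, fall 2024
RAM_ADDED_PER_NODE_GB = 256
UPGRADE_QUOTE_USD = 90_000       # quote to add RAM across the cluster

cluster_cost = NODES * NODE_PRICE_2024_USD        # what the cluster cost in 2024
ram_added_gb = NODES * RAM_ADDED_PER_NODE_GB      # 1TB total
usd_per_gb = UPGRADE_QUOTE_USD / ram_added_gb     # implied price per GB of RAM
quote_vs_cluster = UPGRADE_QUOTE_USD / cluster_cost

print(f"original cluster: ~${cluster_cost:,}")
print(f"RAM added: {ram_added_gb} GB")
print(f"quote per GB: ~${usd_per_gb:.0f}")
print(f"quote vs. original cluster cost: ~{quote_vs_cluster:.1f}x")
```

Under those assumptions the RAM-only upgrade comes out at roughly 1.6 times what the entire cluster cost in 2024.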

reply
> In a few years, there will be hardware capable of running frontier models good enough for most things at accessible prices for even tiny companies.

What makes you so confident about this prediction? Hardware costs haven't exactly been cratering recently.

reply
Do you think this will be a trend for larger companies as well?

The decadal move to all-cloud-all-the-time killed off in-house hardware teams while the C-suite chased their OpEx dreams.

It would be interesting if we come full circle on this.

reply
I’m curious: are you spending on beefy developer machines, or some kind of shared local inference server? Would be interested to know more if it’s the latter.
reply
I am aware of at least a handful of companies doing the latter. I don’t work for them and cannot speak to their setup.
reply
I configured a dual DGX Spark cluster, and it's certainly "good enough" for my agentic and coding needs.
reply
What models? Last I tried different local models, there was a pretty big difference from frontier.
reply
> In a few years, there will be hardware capable of running frontier models good enough for most things at accessible prices for even tiny companies.

I was going to say that the models are just going to keep growing at a pace exceeding the pace of hardware pricing/availability.

But then I realised that, far more likely, a plateau will be reached (again) where nobody is seeing gains, and at that point hardware will catch up.

reply