Except for, you know, the enterprise customers who won't change their code and will pay to run old, inefficient hardware just to avoid dealing with upgrades?
reply
They can just ask Claude to upgrade it for them, completing the circle!
reply
I'd agree, but that's also too scary. And the bottleneck is the massive manual change control process, since there's no automation around any of this. :)

Why take a risk when you can spend money and take no risk?

reply
As long as the demand for GPUs keeps increasing, there are more data centers being built to house them.

When you have waitlists for many many months for Blackwell GPUs, keeping the old ones around as long as customers are willing to pay for them is great.

Say I'm a customer with a use case for a machine learning model I developed a while ago, for example an insect identification model. I had an ML researcher/engineer develop it back in 2019, and it runs fine on a 2018-era T4 GPU (NVIDIA RTX 2080 era). Why mess with it?

reply
We aren't talking about insect identification models from 2019.
reply
What do you think is running on the T4 GPUs in AWS? A lot of the use cases I know of for them are mid-level computer vision models that don't need to be frontier level.
reply
I can no longer edit this, but want to expand on my comment.

I've seen those vision researchers wanting to train on H100s at the time and being told no, wait for the T4s.

I've seen T4s running BERT models for document classification.

When there are enough Blackwells in data centers that H100s are useless for inference by your standards (I don't know if we've arrived there or not yet), there will be people who, say, want to run the Taco Bell ordering chatbot on them. There will be people who have applications that are just fine with Qwen 2.5 who will be happy renting them.

There seems to be this crazy consensus that hyperscalers are going to go into their datacenters and throw away their old GPUs. The reality is they have a ton of paying customers for them.

And there may be insect identification apps from 2019 whose developers say, "You know what? H100s have gotten cheap enough that I can use a VLLM so the user can describe where they saw the insect too." Or the McDonald's website support chatbot developers say, "Hey, the bigger chips have gotten cheap enough that we can upgrade our models to Qwen 2.5."

The frontier-level GPUs in, e.g., AWS carry a huge premium. When the newer generations come out, providers will be able to cut prices on the older ones to a small premium over operational costs and still make a profit. There are a ton of down-market customers who will be interested and who aren't willing to try to outbid Anthropic for Blackwells.

reply