Now jump ahead 2 years and you seem to get a massive leap in performance [1]. Tokens/watt goes up by at least 2 orders of magnitude. And the B100 is 3-4x that. And we're about to hit the R100 (Rubin) cliff.
That's what this is going to come down to. When hyperscaler DCs are getting to gigawatt power usage, it all comes down to power efficiency. Those A100s aren't far from being sold for scrap.
I've been looking into how different companies are handling depreciation for this. Amazon seems to be saying the life is 3-4 years, Google 4-5 and Meta is saying 8+, which I think is wildly optimistic.
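To put rough numbers on why the assumed life matters, here's a minimal sketch of straight-line depreciation. The zero salvage value and the $10,000 purchase price are my own placeholder assumptions, not anything these companies have disclosed, and the function name is made up for illustration.

```python
def straight_line_book_value(cost, useful_life_years, age_years):
    """Remaining book value under straight-line depreciation.

    Assumes zero salvage value (a simplification for illustration).
    """
    remaining_fraction = max(0.0, 1.0 - age_years / useful_life_years)
    return cost * remaining_fraction


# Hypothetical purchase price, NOT a real quoted figure.
COST = 10_000.0
AGE_YEARS = 3

# Compare book value after 3 years under 3-, 5- and 8-year assumed lives.
for life in (3, 5, 8):
    value = straight_line_book_value(COST, life, AGE_YEARS)
    print(f"{life}-year life: ${value:,.2f} still on the books after {AGE_YEARS} years")
```

With these made-up inputs, the same three-year-old card is fully written off on a 3-year schedule but still carries most of its purchase price on an 8-year one. That's the gap that matters if the hardware really is near scrap value by then.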
[1]: https://lambda.ai/inference-models/deepseek-ai/deepseek-v4-f...