Let's use the example of a 1 GW AI deployment.

At $0.07/kWh, that costs $70,000 every hour in electricity alone. That's about $1.7 million/day, or $613 million/year.
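For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes a constant 1 GW draw all year and a flat $0.07/kWh rate; the function and variable names are just illustrative.

```python
# Assumed inputs from the comment: 1 GW continuous draw at a flat $0.07/kWh.
POWER_KW = 1_000_000      # 1 GW expressed in kW
PRICE_PER_KWH = 0.07      # USD per kWh


def electricity_cost(power_kw, price_per_kwh, hours):
    """Cost in USD of drawing power_kw continuously for the given hours."""
    return power_kw * price_per_kwh * hours


per_hour = electricity_cost(POWER_KW, PRICE_PER_KWH, 1)
per_day = electricity_cost(POWER_KW, PRICE_PER_KWH, 24)
per_year = electricity_cost(POWER_KW, PRICE_PER_KWH, 24 * 365)

print(f"${per_hour:,.0f}/hour, ${per_day / 1e6:.2f}M/day, ${per_year / 1e6:.1f}M/year")
```

Changing `PRICE_PER_KWH` shows how sensitive the yearly bill is to the local rate.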

I had Claude estimate the GPU cost of such a deployment:

> To get racks per GW: a full NVL72 rack draws roughly 130-132 kW under full load. If a 1 GW facility runs ~715 MW of IT power (after a ~1.4 PUE for cooling), that's on the order of 4,000–4,500 racks. At $3.4M of compute hardware each, the GPU-system cost lands around $14–15 billion.
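A rough sketch of that estimate, using only the figures in the quote (all of them approximate). Note that plain division of ~715 MW by 130 kW gives closer to 5,500 racks than the quoted 4,000–4,500; the quote doesn't say what else is budgeted per rack, so the sketch prints both the quoted range and the straight-division result.

```python
# Figures taken from the quoted estimate; treat all of them as rough assumptions.
FACILITY_MW = 1000        # 1 GW facility
PUE = 1.4                 # quoted power usage effectiveness
RACK_KW = 130             # quoted NVL72 draw, low end of 130-132 kW
COST_PER_RACK = 3.4e6     # quoted compute hardware cost per rack, USD

it_power_mw = FACILITY_MW / PUE                  # power left for IT load
racks_by_division = it_power_mw * 1000 / RACK_KW


def gpu_cost(racks):
    """Total rack hardware cost in USD for a given rack count."""
    return racks * COST_PER_RACK


for racks in (4000, 4500, racks_by_division):
    print(f"{racks:,.0f} racks -> ${gpu_cost(racks) / 1e9:.1f}B")
```

The quoted range works out to roughly $13.6–15.3 billion, while the straight division lands higher, so the $15 billion figure is best read as order-of-magnitude.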

$15 billion / $613 million per year ≈ 24.5 years until electricity costs catch up to the GPUs. Obviously electricity isn't 100% of OpEx, but I'd expect it to be the majority for AI deployments.

Regardless, if you can cut the $613 million/year in half, that's still a massive saving (about $306 million/year).
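The same comparison as a quick sketch, assuming the $15 billion GPU figure and the $613.2 million/year electricity figure from above (both rough, and ignoring hardware refresh cycles and price changes):

```python
GPU_COST = 15e9                   # upper end of the quoted GPU estimate, USD
ELECTRICITY_PER_YEAR = 613.2e6    # 1 GW at $0.07/kWh for 8,760 hours, USD

# Years of electricity spend needed to equal the up-front GPU cost.
years_to_match = GPU_COST / ELECTRICITY_PER_YEAR

# Yearly saving if the electricity bill were cut in half.
savings_if_halved = ELECTRICITY_PER_YEAR / 2

print(f"{years_to_match:.1f} years for electricity to equal the GPU cost")
print(f"${savings_if_halved / 1e6:.1f}M/year saved if the bill is halved")
```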

reply
Do they? Genuinely asking.
reply
Yep, I was surprised to learn that too.
reply
For a small cluster, no, but at major data center level, yes. Which is why they're building data centers bigger than stadiums.

If you spend $10B on a data center, roughly 30% of that goes to hardware, so roughly $3B.

So for two data centers you're spending $20B.

Now, assume there's hardware that performs twice as fast at the same power draw (so half the energy per token). Even if it costs you twice as much, you're saving $7B because you don't need the second data center.

You get the output of a $20B build from a $13B initial investment, and you're also halving operational costs: fewer staff, fewer lawyers, etc.
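Spelled out as a sketch, using the commenter's round numbers ($10B per data center, 30% hardware share, hypothetical hardware that doubles throughput at twice the price):

```python
DATACENTER_COST = 10e9    # USD per data center (commenter's figure)
HARDWARE_SHARE = 0.30     # fraction of the build spent on hardware (commenter's figure)
PRICE_MULTIPLIER = 2      # assumed: the faster hardware costs twice as much

hardware = DATACENTER_COST * HARDWARE_SHARE
non_hardware = DATACENTER_COST - hardware

baseline = 2 * DATACENTER_COST                         # two standard data centers
upgraded = non_hardware + hardware * PRICE_MULTIPLIER  # one site with 2x-priced hardware
savings = baseline - upgraded

print(f"baseline ${baseline / 1e9:.0f}B, upgraded ${upgraded / 1e9:.0f}B, "
      f"saving ${savings / 1e9:.0f}B")
```

The saving shrinks as `HARDWARE_SHARE` grows: at a 100% hardware share, paying double for double the speed saves nothing on the build itself.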

This is why Nvidia is making gargantuan margins: hyperscalers don't really care about hardware cost. If they can get double the output and save themselves 30-40% of total costs and 50% of the headaches, they will keep buying at twice the price, gen over gen.
