This is not observable from LLM inference, where you would not encounter uniform matrices.

Power limiting does not improve performance but it does improve efficiency. You might be able to get 90% of the performance for only 70% of the power usage, for example. It does not make the card go faster though.
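To put a number on that, here is a minimal sketch of the arithmetic. The 90% and 70% figures are the illustrative ones from this comment, not measurements, and `perf_per_watt_gain` is a hypothetical helper name.

```python
def perf_per_watt_gain(perf_fraction: float, power_fraction: float) -> float:
    """Perf-per-watt relative to the uncapped card (1.0 means unchanged)."""
    if power_fraction <= 0:
        raise ValueError("power_fraction must be positive")
    return perf_fraction / power_fraction


# Illustrative figures only: 90% of the performance at 70% of the power.
gain = perf_per_watt_gain(0.90, 0.70)
print(f"perf per watt: {gain:.2f}x the uncapped card")

# Energy for a fixed amount of work scales the other way: power * time,
# and time grows by 1 / perf_fraction.
energy_per_job = 0.70 / 0.90
print(f"energy per job: {energy_per_job:.2f}x the uncapped card")
```

Throughput still drops to 0.9x in this example. Only the work done per joule goes up.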

When thermal throttling occurs you can perform faster by running slower.

This is precisely because of the efficiency. The lower efficiency at the higher speed triggers throttling, and with it much lower performance, sooner.

> When thermal throttling occurs you can perform faster by running slower.

This is not true unless the throttling algorithm is so broken that it's oscillating between extremes.

The parts have a curve of clock speed versus voltage. More clock speed means higher performance. That goes further up the voltage curve, meaning more power.

Throttling just moves the card further down the voltage to clock speed curve. It reduces clock speed, reducing performance.

The cards don't "perform faster by running slower". If you run the card slower, it performs slower.
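A rough sketch of that curve, using the textbook approximation that dynamic power scales with C * V^2 * f. The linear voltage curve, the capacitance constant, and the assumption that performance is proportional to clock are all made up for illustration and do not describe any real card.

```python
def voltage_for_clock(clock_ghz: float) -> float:
    # Hypothetical linear V/f curve: 0.70 V at 1.0 GHz, +0.25 V per extra GHz.
    return 0.70 + 0.25 * (clock_ghz - 1.0)


def dynamic_power(clock_ghz: float, capacitance: float = 100.0) -> float:
    # Textbook approximation: P ~ C * V^2 * f (arbitrary units here).
    v = voltage_for_clock(clock_ghz)
    return capacitance * v ** 2 * clock_ghz


for clock in (1.2, 1.5, 1.8, 2.1):
    power = dynamic_power(clock)
    # Assumes performance is proportional to clock speed.
    print(f"{clock:.1f} GHz  power={power:6.1f}  perf/power={clock / power:.4f}")
```

Under these assumptions, performance rises with clock at every step while perf per unit of power falls, so moving down the curve costs speed and gains efficiency.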

With a lower power cap set, it runs cooler, which sometimes allows the GPU to reach higher boost speeds. This is a real effect on gaming GPUs. However, I have no idea if it applies to datacenter GPUs.

In general, constraints require optimizations and rearchitectures. I'd also expect the RAM shortage, for instance, to have a big impact on the software industry as a whole, especially in games. They will need to make do with what people have: a PS5/PS5 Pro or the PC equivalent.

I actually think it is a good thing to introduce constraints to AI and the overall tech industry. Hopefully everyone will have to look at improving performance without having to add RAM or increase CPU/GPU performance.

As long as these constraints apply to everyone, rather than "for thee but not for me", and don't become an instrument for big tech to keep consumers dependent on their infra.