I really need to shut up, or bite the bullet and buy one.
If you graph the tokens per second on the 5090, your jaw will hit the floor at how cheap it is.
The RTX 5090 is faster than an H200. It just has less RAM (32 GB vs 141 GB), doesn't have NVLink, and technically isn't allowed to be used in a datacenter.
The datacenter GPUs sell at an 80% margin. They're incredibly overpriced. But the laws of supply and demand are undefeated and so here we all are.
The H200 has HBM and much more 64-bit (FP64) compute.
The RTX 5090 has more CUDA cores, and they run at a higher clock speed. The H200 has more RAM and significantly more memory bandwidth.
Which one is net faster depends on your use case. But you may be very surprised that many workflows are faster on an RTX 5090!
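To make the "depends on your use case" point concrete, here is a rough roofline-style sketch in Python. It assumes throughput is capped either by peak compute or by memory bandwidth times arithmetic intensity (FLOPs per byte moved). The numbers for the two cards are illustrative placeholders I made up to show the shape of the tradeoff, not vendor specs, and the names `attainable_tflops` and `faster_card` are hypothetical.

```python
def attainable_tflops(peak_tflops, bandwidth_tbs, intensity_flop_per_byte):
    """Roofline estimate: the lower of the compute cap and the memory cap.

    bandwidth (TB/s) * intensity (FLOP/byte) gives TFLOP/s.
    """
    return min(peak_tflops, bandwidth_tbs * intensity_flop_per_byte)


# Illustrative placeholder numbers, NOT vendor specifications.
# "consumer": more raw compute, less memory bandwidth.
# "datacenter": less raw compute (in this precision), more memory bandwidth.
CARDS = {
    "consumer": {"peak_tflops": 100.0, "bandwidth_tbs": 1.8},
    "datacenter": {"peak_tflops": 60.0, "bandwidth_tbs": 4.8},
}


def faster_card(intensity_flop_per_byte):
    """Return the name of the card with the higher roofline estimate."""
    return max(
        CARDS,
        key=lambda name: attainable_tflops(
            CARDS[name]["peak_tflops"],
            CARDS[name]["bandwidth_tbs"],
            intensity_flop_per_byte,
        ),
    )


if __name__ == "__main__":
    for intensity in (1.0, 10.0, 100.0):
        print(intensity, faster_card(intensity))
```

With these made-up numbers, memory-bound work (low intensity) favors the high-bandwidth card and compute-bound work (high intensity) favors the card with more raw compute. Real workloads also depend on precision, tensor-core use, and whether the model fits in memory at all, which this sketch ignores.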