If my GPU is sitting idle, and I mean idle with nothing loaded into its memory, it draws about 18W. If I load a model that uses nearly all of the memory but the model is idle, it's at 36W. If that model is actively thinking, it's around 118W. I think this is likely due to the GPU being aware that there is real data loaded into memory and turning up the DRAM refresh rate, whereas when nothing is loaded, the dynamic power is as low as possible.
Yes, I have some of these cards and AFAICT the HBM2e chips just always run at full speed. I have different variants of the PCIe cards, and while I can get the GPU itself into a lower power state, the memory just runs full tilt. Though I see 40W on my “normal” cards and 60W on the Frankenstein card that thinks it’s an SXM4.
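For anyone who wants to compare numbers on their own card, here is a rough sketch of how the three states (empty, model loaded but idle, model running) could be logged side by side. It assumes an NVIDIA card with `nvidia-smi` on the PATH; the helper names `parse_sample` and `read_sample` are made up for illustration, and some cards report `[N/A]` for certain fields. Watching the memory clock and performance state next to the power draw should show whether the memory really stays at full speed while the GPU core drops down.

```python
import subprocess
import time

# Query fields passed to `nvidia-smi --query-gpu`.
FIELDS = ["power.draw", "memory.used", "utilization.gpu", "clocks.mem", "pstate"]


def parse_sample(line):
    """Turn one CSV line of nvidia-smi output into a dict keyed by field name."""
    values = [v.strip() for v in line.split(",")]
    return dict(zip(FIELDS, values))


def read_sample(gpu_index=0):
    """Ask nvidia-smi for the current readings of one GPU (hypothetical helper)."""
    out = subprocess.run(
        [
            "nvidia-smi",
            f"--id={gpu_index}",
            "--query-gpu=" + ",".join(FIELDS),
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_sample(out.strip().splitlines()[0])


if __name__ == "__main__":
    # Print one reading per second; stop with Ctrl+C.
    while True:
        print(read_sample())
        time.sleep(1)
```

Run it once with nothing loaded, once with the model loaded but idle, and once during generation, then compare the `power.draw`, `clocks.mem`, and `pstate` columns across the three runs.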