A100: 1935 GB/s of HBM2e
Most of those FLOPS are constrained by memory bandwidth.
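To put a rough number on that: a minimal roofline-style sketch, using the 1935 GB/s figure quoted above. The 312 TFLOP/s peak is an assumption taken as the A100 dense FP16 tensor-core figure, and the function and variable names are made up for illustration.

```python
def attainable_flops(peak_flops, bandwidth_bytes_per_s, intensity_flop_per_byte):
    """Roofline bound: the lower of peak compute and bandwidth * arithmetic intensity."""
    return min(peak_flops, bandwidth_bytes_per_s * intensity_flop_per_byte)


BANDWIDTH = 1935e9  # bytes/s, the HBM2e figure quoted above
PEAK = 312e12       # FLOP/s, ASSUMED: A100 dense FP16 tensor-core peak

# Arithmetic intensity (FLOP per byte moved) needed before compute becomes the limit.
ridge = PEAK / BANDWIDTH

# Elementwise add of two FP16 vectors: 1 FLOP per 6 bytes (two 2-byte reads, one 2-byte write).
add_intensity = 1 / 6
add_bound = attainable_flops(PEAK, BANDWIDTH, add_intensity)

print(f"ridge point: {ridge:.1f} FLOP/byte")
print(f"elementwise add is capped at {add_bound / 1e12:.2f} TFLOP/s")
```

Under these assumptions a kernel needs on the order of 160 FLOP per byte of memory traffic to reach peak, so anything streaming-style stays bandwidth-bound.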
but it is very impressive how far modern CPUs get as well (also in smartphones!)
I found the comparison interesting
on an Intel Xeon 690P with 419 TFLOP/s it is still (maybe even more?) interesting to ask:
how much throughput can you reach with plain Python, with Python plus lib x, y, z, with C++ written one way, with C++ written another way, etc., and why?
no?
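For the plain-Python end of that question, a minimal sketch of how one might measure it, standard library only. `dot_flops` is a hypothetical helper, the vector size is arbitrary, and it counts one multiply plus one add per element as 2 FLOP.

```python
import time


def dot_flops(n=200_000, repeats=3):
    """Time a pure-Python dot product; return (result, best observed FLOP/s)."""
    a = [1.0] * n
    b = [2.0] * n
    best = float("inf")
    acc = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        acc = 0.0
        for x, y in zip(a, b):
            acc += x * y  # one multiply and one add per element
        best = min(best, time.perf_counter() - start)
    return acc, 2 * n / best


if __name__ == "__main__":
    result, rate = dot_flops()
    print(f"dot = {result}, about {rate / 1e6:.1f} MFLOP/s in pure Python")
```

Running the same loop through a library or a C++ build and comparing against the hardware peak gives the "and why?" part something concrete to start from.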
But this discussion is even more bizarre than comparing a screwdriver to a hammer, it’s like comparing a screwdriver to a nail.