That's likely the data rate of the ADC chips. You would downsample the streams directly on the FPGA boards and maybe perform an FFT or a similar transform. 16 TB/s spread across a few dozen FPGA boards is nothing crazy. After the early signal-processing stages you might transfer 1 or 2 TB/s over Ethernet to the servers. That's entirely feasible considering we have 800 Gbit/s Ethernet.
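For a rough sanity check of those numbers, here is a back-of-the-envelope sketch in Python. The 16 TB/s, 1 to 2 TB/s, and 800 Gbit/s figures come from the comment above; the board count of 36 is only a hypothetical stand-in for "a few dozen", and the calculation assumes decimal units and ignores protocol overhead.

```python
import math

ADC_TOTAL_TBYTES = 16.0   # TB/s, aggregate ADC output (figure from the comment)
LINK_GBITS = 800          # Gbit/s per Ethernet link (figure from the comment)
BOARDS = 36               # hypothetical value for "a few dozen" FPGA boards


def links_needed(tbytes_per_s, link_gbits=LINK_GBITS):
    """Number of Ethernet links needed to carry the given rate (no overhead)."""
    link_gbytes = link_gbits / 8  # 800 Gbit/s is 100 GB/s
    return math.ceil(tbytes_per_s * 1000 / link_gbytes)


def per_board_gbytes(total_tbytes_per_s, boards):
    """Average ingest rate per FPGA board in GB/s."""
    return total_tbytes_per_s * 1000 / boards


if __name__ == "__main__":
    print(f"per board: {per_board_gbytes(ADC_TOTAL_TBYTES, BOARDS):.0f} GB/s")
    print(f"links for 1 TB/s: {links_needed(1)}")   # 10
    print(f"links for 2 TB/s: {links_needed(2)}")   # 20
```

Under these assumptions the post-FPGA traffic fits in 10 to 20 links of 800 Gbit/s, while carrying the full 16 TB/s off the boards would take 160, which is why the early reduction on the FPGAs matters.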
You’re completely right; this is why ultrasound reconstruction currently happens on FPGAs. They would need a lot of them, given the number of transducers.
https://pmc.ncbi.nlm.nih.gov/articles/PMC6057541/