What's unclear to me is how much Google uses GPUs for its own stuff. Yes, Gemini runs on GPUs now, so that Google can sell Gemini on-prem boxes (a release announced last week), but is any of Gemini's training or inference really happening on GPUs? I'd have guessed not, given that I thought TPUs were much cheaper to operate, but maybe I'm wrong.
Caveat: I work at Google, but not on anything related to this. I'm only going on what's in the press.
Do you have any more information on this? I only found this article about it: https://venturebeat.com/technology/googles-gemini-can-now-ru...
It mentions that Gemini can run on eight NVIDIA GPUs, but not which GPU or which Gemini model. Even so, assuming 288 GB of memory per GPU (the most on any current NVIDIA part), this puts an upper bound of 288 * 8 = 2304 GB on the size of the Gemini model weights, which as far as I know has been a secret until now.
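As a rough back-of-envelope (my own sketch: it assumes the 288 GB-per-GPU figure, treats all memory as holding weights, and ignores KV cache, activations, and runtime overhead, so the real weight budget is smaller):

```python
# Upper bound on parameter count from the 8-GPU memory figure.
# Assumption: 288 GB per GPU, all of it available for weights
# (ignores KV cache, activations, and framework overhead).
GPUS = 8
MEM_PER_GPU_GB = 288
total_gb = GPUS * MEM_PER_GPU_GB  # 2304 GB

# Bytes per parameter at common serving precisions.
BYTES_PER_PARAM = {"fp16/bf16": 2.0, "fp8/int8": 1.0, "int4": 0.5}

for precision, nbytes in BYTES_PER_PARAM.items():
    max_params = total_gb * 1e9 / nbytes  # using decimal GB for simplicity
    print(f"{precision}: at most ~{max_params / 1e12:.1f}T parameters")
```

So the bound depends heavily on serving precision: roughly 1.2T parameters if the weights are bf16, up to about 4.6T if they're quantized to 4 bits.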