The real efficiency win in these chips is that they are made for inference only. You can throw away the vast majority of a chip if you only need a few ops and a single precision (like INT8 or FP8), and don't need ultra-fast interconnects.
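To make the "single precision" point concrete, here is a rough sketch of what INT8 inference arithmetic looks like: floats are mapped to small integers once, and the hot loop is nothing but integer multiply-accumulate. The function names (`quantize_int8`, `int8_dot`) and the symmetric per-tensor scaling scheme are my own assumptions for illustration, not how any particular chip does it.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization (an assumed scheme, for illustration).

    Maps floats to integers in [-127, 127] and returns them with the scale
    needed to convert back to floats.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def int8_dot(a, b):
    """Dot product where the inner loop uses only integer multiply-accumulate."""
    qa, scale_a = quantize_int8(a)
    qb, scale_b = quantize_int8(b)
    acc = sum(x * y for x, y in zip(qa, qb))  # integer-only accumulation
    return acc * scale_a * scale_b  # a single float rescale at the end


# Hypothetical toy values, not taken from any real model.
weights = [0.5, -1.0, 0.25, 0.75]
activations = [1.0, 2.0, -0.5, 0.1]

exact = sum(w * x for w, x in zip(weights, activations))
approx = int8_dot(weights, activations)
print(f"float result: {exact:.4f}, int8 result: {approx:.4f}")
```

The two results should land close to each other, which is the whole bet: if that small error is acceptable for inference, the hardware only has to be good at one narrow kind of integer math.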
Google’s internal review blocked it from publication. Stated reasons were about paper quality. You can speculate whether that was the real reason.
Gebru sent an ultimatum email saying she would resign unless a list of conditions was met.
Google said “thanks, we accept your resignation”.
She claims it is retaliation, but it seems more like an own-goal if you ask me. She basically handed Google the solution to their problem.
Practical lesson: don’t tell your employer you might quit before you’re ok with leaving.