It is entirely plausible to me that Opus 4.7 is designed to consume more tokens in order to artificially reduce the API cost/token, thereby obscuring the true operating cost of the model.
I agree though, I chose poor phrasing originally. Better to say that GB200 vs Tranium could contribute to the efficiency differential.
Like Chinese versus English - you need fewer Chinese characters to say something than if you write that in English.
So this model internally could be thinking in much more expressive embeddings.