upvote
And not just token usage, expensive token usage; when it comes to tokens/joule not all tokens are equal. Efficient use of MoE architectures does have an impact on tokens/joule and tokens/sec.
reply
I like the intelligence per watt and intelligence per joule framing in https://arxiv.org/abs/2511.07885 It feels like a very useful measure for thinking about long-term sustainable variants of AI build-outs.
reply