If you are simply measuring watt cost per token, you are missing the mark drastically. You have to measure quality of output per watt.
That sounds reasonably difficult to benchmark, though maybe I'm wrong.
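To make the difference concrete, here is a rough sketch of the two metrics side by side. Everything in it is an assumption for illustration: the `Run` fields, the idea that quality can be collapsed into a single benchmark score between 0 and 1, the choice to read "per watt" as per joule of energy used, and the numbers themselves, which are made up and not measurements.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One hypothetical benchmark run of a model (all fields are assumptions)."""
    name: str
    tokens: int            # tokens generated during the run
    energy_joules: float   # energy drawn during the run
    quality: float         # some benchmark score in [0, 1]


def joules_per_token(run: Run) -> float:
    """Energy cost per token: lower is better, ignores quality."""
    return run.energy_joules / run.tokens


def quality_per_joule(run: Run) -> float:
    """Quality delivered per unit of energy: higher is better."""
    return run.quality / run.energy_joules


# Made-up numbers, chosen only to show that the two metrics can disagree.
runs = [
    Run(name="small", tokens=1000, energy_joules=500.0, quality=0.2),
    Run(name="large", tokens=1000, energy_joules=1000.0, quality=0.8),
]

cheapest = min(runs, key=joules_per_token)
most_useful = max(runs, key=quality_per_joule)

print("lowest energy per token:", cheapest.name)
print("highest quality per joule:", most_useful.name)
```

With these invented numbers the smaller model wins on energy per token and the larger one wins on quality per joule. The hard part is the `quality` field, since there is no agreed single score to put there.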