On OpenRouter, the cheapest GLM 5.2 provider costs $3/MTok (at 44 tps). Assuming most use is output tokens, that's still the equivalent of 450k tokens/day, so we're in the same ballpark, but without the capex for two 3090s and the machine.
Self-hosting only makes economic sense if your priority is being in control / avoiding surveillance.
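As a rough sanity check on those numbers, here is a minimal sketch of the API side of the comparison. It assumes every token is billed at the quoted $3/MTok output rate and uses a 31-day month; both are simplifications, and the function names are made up for illustration.

```python
# Figures quoted in the comment above; everything else is an assumption.
PRICE_USD_PER_MTOK = 3.0   # cheapest provider's price per million tokens
TOKENS_PER_DAY = 450_000   # daily token volume used for the comparison
API_TPS = 44               # quoted provider throughput, tokens per second
DAYS_PER_MONTH = 31        # assumption: a 31-day month


def api_cost_usd(tokens: int, price_per_mtok: float) -> float:
    """Cost of `tokens` tokens billed at `price_per_mtok` USD per million."""
    return tokens / 1_000_000 * price_per_mtok


def generation_hours(tokens: int, tps: float) -> float:
    """Hours of continuous generation needed to emit `tokens` at `tps`."""
    return tokens / tps / 3600


daily = api_cost_usd(TOKENS_PER_DAY, PRICE_USD_PER_MTOK)
monthly = daily * DAYS_PER_MONTH
hours = generation_hours(TOKENS_PER_DAY, API_TPS)

print(f"API cost per day:   ${daily:.2f}")
print(f"API cost per month: ${monthly:.2f}")
print(f"Hours of generation at {API_TPS} tps: {hours:.1f}")
```

Under these assumptions the API bill comes to roughly $1.35 a day, which is the figure to set against the electricity and hardware cost of the self-hosted box.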
A system that draws 600W with all CPU cores, RAM and a few 3090-class GPUs maxed out might draw only around 90W when idle at 0.00 Unix load.
If we say: (600 * 24 * 31)/1000 = 446.4 kWh in a month at full load 24 hours a day.
But it could be less. If it's at full load only 12 hours a day, that's (600 * 12 * 31)/1000 = 223.2 kWh of "full load" 600W time in a month, plus (90 * 12 * 31)/1000 = 33.48 kWh of idle time, so about 257 kWh in total.
If you're the only user accessing it and you only "use" it 12 hours a day, that cumulative yearly dollar figure would be almost halved. Or even less if a person is using it in bursts and intermittently throughout an 8 hour workday.
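The duty-cycle arithmetic above can be sketched as a small calculation. The 600W and 90W figures and the 31-day month come from the comment; the helper name `monthly_kwh` is made up for illustration, and the model assumes the machine is either fully loaded or fully idle, which real usage will not match exactly.

```python
LOAD_W = 600   # draw at full load, from the comment above
IDLE_W = 90    # draw at idle, from the comment above


def monthly_kwh(watts: float, hours_per_day: float, days: int = 31) -> float:
    """Energy in kWh for a constant draw of `watts` for `hours_per_day` each day."""
    return watts * hours_per_day * days / 1000


# Scenario 1: full load around the clock.
always_on = monthly_kwh(LOAD_W, 24)

# Scenario 2: 12 hours at full load, 12 hours idle.
mixed = monthly_kwh(LOAD_W, 12) + monthly_kwh(IDLE_W, 12)

print(f"24h full load:     {always_on:.2f} kWh/month")
print(f"12h load/12h idle: {mixed:.2f} kWh/month")
print(f"Mixed as share of 24h load: {mixed / always_on:.1%}")
```

With these inputs the 12-hour scenario uses a bit under 60% of the energy of the 24/7 case, so "almost halved" is in the right range but slightly generous, because the idle draw is not zero.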
You can’t do that with 6 tps, though.
No, you would pay usage-based rates with the API in this case. I have exactly one fixed monthly rate for the 6 AI models I have tokens available for.
It isn't 100% efficient. Even the best PSUs aren't.
There is no "ubiquitous" geothermal where there is also high power usage. Data centers have to go where the power is, not where it could be.
[1] https://en.wikipedia.org/wiki/List_of_geothermal_power_stati...
[1] https://www.cnbc.com/2025/03/12/amazon-google-and-meta-suppo...
[2] https://www.sciencenews.org/article/small-modular-nuclear-re...
[3] https://floodlightnews.org/fraud-and-corruption-on-rise-at-u...
[4] https://decarbonization.visualcapitalist.com/animated-70-yea...
There's also tons of opportunity to build them out in former pulp mill towns on Vancouver Island that have big interconnects or dedicated generation.
You'd have to be an idiot to put a datacentre in Vancouver, or have fuck-off scale monopoly money, which is probably why Telus is doing it.