The LLM inference itself may be more efficient (though this depends on the throughput vs. latency tradeoff; local inference makes it easier to tolerate higher latency), but making the hardware is not. Datacenter-class hardware costs orders of magnitude more, so repurposing existing hardware is a real efficiency gain.
If you're purely repurposing hardware that you need anyway for other uses, that doesn't really matter.
(Besides, your utilization might actually rise if you're making do with potato-class hardware that can only achieve low throughput and high latency: you'd be running inference in the background basically all the time.)
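The utilization point can be made concrete with a back-of-envelope sketch. All numbers below are hypothetical, just to show the shape of the argument: a fixed token demand keeps a slow machine busy most of the time, while the same demand leaves a fast accelerator mostly idle.

```python
def utilization(demand_tok_per_s: float, capacity_tok_per_s: float) -> float:
    """Fraction of time the hardware is busy serving a steady token demand.

    Capped at 1.0: beyond that point the machine is saturated and
    requests queue up instead.
    """
    return min(demand_tok_per_s / capacity_tok_per_s, 1.0)

# Hypothetical steady demand of 50 tokens/s.
# A datacenter accelerator at 5000 tok/s sits ~99% idle on this workload...
datacenter = utilization(demand_tok_per_s=50, capacity_tok_per_s=5000)

# ...while potato-class hardware at 60 tok/s runs near-continuously.
potato = utilization(demand_tok_per_s=50, capacity_tok_per_s=60)

print(f"datacenter utilization: {datacenter:.0%}")  # → 1%
print(f"potato utilization:     {potato:.0%}")      # → 83%
```

This ignores batching, which is exactly the throughput-vs-latency lever mentioned above: the datacenter card earns its keep by serving many users at once, which a single repurposed box never needs to do.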
You might want to read this: https://arxiv.org/abs/2502.05317v2