points
Service providers that run batch>1 inference are far more efficient per watt: the model weights are read from memory once per forward pass, so that memory traffic is amortized across every request in the batch.
Local inference is typically stuck at batch=1, since a single user rarely has concurrent requests. At batch=1 the hardware is memory-bandwidth bound and most of its compute sits idle, which is very inefficient.
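A minimal numpy sketch of the amortization argument (matrix sizes and names are illustrative, and a single weight matrix stands in for a full model): processing eight requests one at a time traverses the weights eight times, while one batched call traverses them once and produces the same outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "model": one weight matrix standing in for a transformer layer.
d_in, d_out = 512, 512
W = rng.standard_normal((d_in, d_out)).astype(np.float32)

# Eight independent requests, each a single input vector.
requests = rng.standard_normal((8, d_in)).astype(np.float32)

# batch=1: the weights are read once PER request (8 passes over W).
out_seq = np.stack([x @ W for x in requests])

# batch=8: the weights are read once for ALL requests,
# amortizing the memory traffic that dominates inference cost.
out_batched = requests @ W

assert np.allclose(out_seq, out_batched, atol=1e-4)
```

The arithmetic done is identical in both cases; the batched path wins because weight memory traffic, not FLOPs, is the bottleneck at small batch sizes.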