Subscription inference can also cost the provider less than API inference, if the provider wants it to. For example, providers can schedule subscription requests flexibly around API traffic, which lowers the cost of serving them and gets better utilization out of the hardware.