I've got a local setup too but unless you consider hardware zero cost, there is really no way to save money. The class of model you can run on <$5k of hardware is dirt cheap to run in the cloud (generating tokens 24/7 non-stop is a few dollars a day at most, possibly even less than the cost of electricity to do it at home).