I would rather spend money on some pseudo-local inference (where a cloud company manages everything for me and I can just specify an open-source model and pay for GPU usage).
Except that newer "agent swarm" workflows do exactly that. Besides, batching requests generally comes with a sizeable increase in memory footprint, and memory is often the main bottleneck, especially with the larger contexts typical of agent workflows (rough numbers below). If you have plenty of agentic tasks that are not especially latency-critical and don't need the absolute best model, it makes plenty of sense to schedule them to run locally.
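To make the memory point concrete, here's a back-of-envelope sketch of KV-cache size as batch size grows. The model config is an assumption on my part (roughly a 70B-class model with grouped-query attention: 80 layers, 8 KV heads, head dim 128, fp16 cache); plug in your own model's numbers.

```python
# Back-of-envelope KV-cache memory for batched inference.
# Config values are illustrative (roughly a 70B-class model with
# grouped-query attention), not any specific vendor's numbers.

def kv_cache_bytes(batch_size: int, context_len: int,
                   n_layers: int = 80, n_kv_heads: int = 8,
                   head_dim: int = 128, dtype_bytes: int = 2) -> int:
    """Two cached tensors (K and V) per layer, per token, per sequence."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * dtype_bytes
    return batch_size * context_len * per_token

for batch in (1, 4, 16):
    gib = kv_cache_bytes(batch, context_len=32_768) / 2**30
    print(f"batch={batch:>2}, 32k ctx: {gib:,.0f} GiB of KV cache")
# batch= 1, 32k ctx: 10 GiB of KV cache
# batch= 4, 32k ctx: 40 GiB of KV cache
# batch=16, 32k ctx: 160 GiB of KV cache
```

The cache grows linearly with both batch size and context length, on top of the model weights themselves, which is why long-context batched serving runs out of memory long before it runs out of compute.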