There's also the concept of "smart routing" requests based on some heuristics / embeddings. You'd get "simple" tasks handled by smaller (cheaper) models and use a bigger model to curate / sort / merge the results.
There's a lot of things to try here. I wouldn't personally pay for this service, but I don't think it's "a joke"...
https://news.ycombinator.com/item?id=44630724
They randomly alternated between frontier LLMs and got a massive boost to performance on cybersecurity tasks.