To be cost-effective with inference providers, you have to find some way to keep it in use 24/7.
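A rough sketch of why utilization matters, assuming the point is about anything billed by the hour (a rented GPU, a dedicated endpoint): the bill is the same whether it sits idle or busy, so the effective price per token goes up as utilization goes down. The hourly cost and throughput below are made-up placeholders, not real prices or benchmarks, and the function name is just for illustration.

```python
def cost_per_million_tokens(hourly_cost, tokens_per_second, utilization):
    """Effective cost per 1M tokens for a resource billed by the hour.

    All inputs are hypothetical placeholders, not real prices or benchmarks.
    utilization is the fraction of each hour spent actually serving requests.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    # Tokens actually produced in one billed hour.
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return hourly_cost / tokens_per_hour * 1_000_000


# Hypothetical numbers, chosen only to show the shape of the curve.
HOURLY_COST = 2.0          # dollars per hour (assumed)
TOKENS_PER_SECOND = 1000.0  # sustained throughput when busy (assumed)

for utilization in (1.0, 0.5, 0.1):
    price = cost_per_million_tokens(HOURLY_COST, TOKENS_PER_SECOND, utilization)
    print(f"{utilization:>4.0%} busy: ${price:.2f} per 1M tokens")
```

With these placeholder inputs, running at 10% utilization costs ten times as much per token as running flat out, which is the whole argument for keeping it busy around the clock.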
If they decided to collude, they could absolutely say "from now on you no longer have access to model X because you're an asshole."
The commercial inference offerings are also downstream of one of those 3 projects (or TensorRT-LLM if they're NVIDIA). It would hit Ollama, Fireworks, Together, and everyone else.
Don't tempt fate.