Please tell us more :). Do you pay per token through Bedrock, OpenRouter, or somewhere else? How many tokens do you use per month, and how many per task? Which harnesses?
We pay for OpenAI Pro directly, $100 a month, but I’m the only one who uses Codex. My nontechnical partner likes to talk to ChatGPT 5.5 Pro for image-related tasks (think generating interior decorating pics).
The nontechnical staff use a Gemini account on a Google family AI Pro sub. I use Antigravity when working on Android or Google Cloud API codebases.
Everyone gets OpenCode Go. The cost is trivial. $10 a month per person.
We pay for MiMo directly, though we only use it during Chinese off-peak hours. Total spend so far is $25 over the last month.
We run a few Qwen models locally and pretty much have them pegged all day. RTX 5090 on a PC and a Mac Studio.
There’s also Grok, which we use for Imagine on artistic and graphic-design work. I also use the subscription for a vision model in my oh-my-pi harness.
We’re having discussions about how to pull in GLM-5.2 cost-effectively. We compete with third world development shops, so we can’t really pass on inference costs, but we can benefit from getting jobs done for customers faster. That said, ⅔ of our work is either internal or open source projects we can’t bill for.
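For a rough sense of scale, here is a minimal sketch that adds up only the dollar figures stated above: $100 for OpenAI Pro, $10 per person for OpenCode Go, and the $25 spent on MiMo last month. The headcount is a placeholder since it isn't given here, and the Google, Grok, and local hardware costs are left as a single hypothetical `other_subscriptions` parameter because no prices were quoted for them.

```python
def monthly_spend(headcount: int, other_subscriptions: float = 0.0) -> float:
    """Rough monthly inference spend in USD, using only the quoted figures."""
    openai_pro = 100.0               # single OpenAI Pro seat
    opencode_go = 10.0 * headcount   # OpenCode Go at $10 per person
    mimo = 25.0                      # MiMo spend over the last month
    # other_subscriptions is an assumption: Google AI Pro, Grok, electricity
    # for the local machines, etc. None of these were priced in the thread.
    return openai_pro + opencode_go + mimo + other_subscriptions


if __name__ == "__main__":
    # headcount=4 is a made-up example value, not the real team size.
    print(monthly_spend(headcount=4))
```

The point is just that the per-seat and off-peak items stay small next to the one Pro seat, whatever the actual headcount turns out to be.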
I can manage this budget with the Chinese models on AWS Bedrock. However, in my experience, they aren't as good as Claude today.
How do you know that the other models you are referring to aren't subsidized?