20 tok/s is an average. It can be more, it can be less. If you are running off-peak I'm sure you'd get some crazy number.
Why wouldn't developers just do LLM arbitrage against OpenRouter if it is a better deal?
For the same reason people don’t do server arbitrage because Hetzner is cheaper than AWS.
The problem is different. OpenRouter is a router to LLMs. It doesn't solve GPU underutilization.
What I am saying is: if your system lets me pay $x/token and OpenRouter lets me pay $y/token, and x < y, then someone could make money just by reselling your tokens through the OpenRouter API. That would either drive up demand for your system, raising its prices, or drive up supply on OpenRouter, lowering prices there. Eventually the prices would converge, no?
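A rough sketch of the arithmetic behind that argument, using made-up prices (nothing here reflects real OpenRouter or provider pricing). Both `resale_margin` and `converge` are hypothetical helpers: the first is the reseller's profit per unit, the second is a toy model that assumes each round of arbitrage closes a fixed fraction of the price gap.

```python
def resale_margin(buy_price, sell_price, fee_rate=0.0):
    """Profit per unit when buying at buy_price and reselling at sell_price.

    fee_rate is an assumed fraction of the sale price kept by the marketplace.
    """
    return sell_price * (1.0 - fee_rate) - buy_price


def converge(x, y, step=0.25, tol=0.01, max_rounds=100):
    """Toy dynamics (an assumption, not a market model): each round, extra
    demand raises the cheap price x and extra supply lowers the expensive
    price y, together closing `step` of the remaining gap.

    Returns the final (x, y) and the number of rounds taken.
    """
    rounds = 0
    while y - x > tol and rounds < max_rounds:
        gap = y - x
        x += step * gap / 2  # cheap side gets more expensive
        y -= step * gap / 2  # expensive side gets cheaper
        rounds += 1
    return x, y, rounds


if __name__ == "__main__":
    # Hypothetical prices in dollars per million tokens.
    print(resale_margin(1.0, 2.0))
    print(converge(1.0, 2.0))
```

If fees or switching costs are larger than the gap, the margin goes negative and the arbitrage never starts, which is roughly the Hetzner/AWS point upthread.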