So, a conversation that's ongoing with one model then switching to another would presumably send the whole conversation and the new question. Which defeats the purpose of splitting traffic...so, you're not wrong to question how this actually improves things for anything other than short sessions, which you could choose your own model for if it's a small problem.
you could even take this into account automatically to help decide
Another use case: You have two models on your local device. One is large and fairly powerful but low, the other is smaller, faster and good at tool calls and chat, but not great for writing and reviewing code. If you route between them per request, you can get a better developer experience with preserved performance.
The linked repo aims to help you achieve these things, as do I with the role-model router and protocol that I linked in another comment.
there are some cache-busting considerations, but solvable