upvote
to add to that, for example at the end of implementing a task, where the model runs the formatters, linters, tests, commits, pushes, this could be done by a very cheap model, and only switch to the main model again if something fails hard

there are some cache-busting considerations, but solvable

reply
indeed. i also wrote elsewhere that the current ideal number of models in a pool is probably 2, so if you route between two both will have warm cache, though not the full cache at all times, so you lose a little but not much.
reply