upvote
It's true that the initial tool response still has the same amount of tokens but it doesn't keep dragged along in the longer-lived top context.
reply
The real benefit is being able to use a cheaper, but good enough, model with a specific system prompt dedicated to that task.
reply