upvote
Isn't this an orthogonal issue that doesn't affect whether billing is done with credits or money?
reply
If the expensive parts of the query happen to work iteratively (especially if agentic), you can act on those loops to bound the cost. Even if it's pure forward generation, you could pause an expensive inference and continue it seamlessly with a cheaper model, adding little to the cost.
reply
[dead]
reply
[dead]
reply