Is there a cost benchmark out there? I wonder how frontier models are doing over time on cost per problem solved.
I think they are optimizing for one-shot performance because that will drive usage. They can’t afford to look bad in the benchmarks. And if that means consuming an order of magnitude more tokens, well, that’s good for business, too.