Why would token speed matter for anything other than getting work done faster? It's in the name - "speed".
reply
This would be true if the models always completed their tasks. But since their failure rate is fairly high, a slower model heading in the wrong direction for longer can cost you more time overall than a faster model, where you can spot it going wrong earlier.
reply
Yeah, it’s like drinking coffee when you’re really tired. You’re still tired, just “faster”. It’s a weird sensation.
reply
It's even stranger that this isn't obvious to someone who uses codex extensively every day.

The rate-limiting step is the LLM going down stupid rabbit holes, or overthinking and getting decision paralysis.

The only time raw speed really matters is if you're trying to add many, many lines of new code. But if you're adding code at token-limited rates, you're going to approach the singularity of an AI-slop codebase in no time.

reply