One query is not going to be a useful benchmark when people are deploying AI swarms in loops to solve simple problems.
Or a human riding a stationary bike for 36 seconds.