> A validator that checks "did the assistant reply?" instead of "was the reply correct?" was never a benchmark. It was a participation trophy.

People can't even write a two-paragraph comment without AI now.
