He wasn't saying that both models suck, but that the heuristics for measuring model capability suck.
..huh?