Hmm, well, from my perspective, none of them are even really playing the game; they are just taking random actions. Any human, even a small child, would be much better.

And re: ages, it's worth noting that the youngest player to make Day 2 of a Grand Prix was 8 years old, and the youngest Pro Tour winner was 15 years old. I don't think it's realistic to get an LLM anywhere close to either of those players in skill level, though it's absolutely possible with a specialized model.

> , so you can say things like "Grok plays as well as a 7-year-old, whereas Opus is a true frontier model and plays as well as a 9-year-old".

No, no, no... please think. Human child psychology is not the same as an LLM engine rating. Making that common comparison is both inaccurate and destructive to actual understanding. Asking politely: please consider not saying that about LLM game ratings.
