Nice idea. I just asked Haiku to do the same in Claude Chat on iOS: it created a interactive react game, implemented the rules and let it play. Clever move for 1$ input and 5$ output, Anthropic!
when i will be extremely bored, I think I will make two models play chess against each other. I bet there's a chess benchmark / llm tournament already somewhere