upvote
I am quite confident that an LLM will never beat a top chess engine like Stockfish. An LLM is a generalist -- it contains a lot of world knowledge, and nearly all of it is completely irrelevant to chess. Stockfish is a specialist tuned specifically to chess, and hence able to spend its FLOPs much more efficiently towards finding the best move.

The most promising approach would be tune a reasoning LLM on chess via reinforcement learning, but fundamentally, the way an LLM reasons (i.e. outputting a stream of language tokens) is so much more inefficient than the way a chess engine reasons (direct search of the game tree).

reply