That world-state can be anything, but in the last year or two the term has taken on a narrower meaning: a video generation model that reacts naturally to game-like controls, as if it were simulating a videogame. But there's no additional state behind the video frames.
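To make "no additional state behind the video frames" concrete, here is a minimal sketch of the interface such a model exposes. Everything in it is an assumption for illustration: `toy_next_frame` is a hypothetical stand-in for a learned video model (it merely shifts pixels), and `rollout`, `context`, and the action names are invented here. The point is only that the next frame is computed from recent frames plus a control input, and nothing else is carried forward.

```python
from typing import Callable, List, Sequence

Frame = List[List[int]]  # toy grayscale frame: rows of pixel values


def toy_next_frame(history: Sequence[Frame], action: str) -> Frame:
    """Hypothetical stand-in for a learned model.

    A real model would predict pixels with a neural network; this toy
    just shifts the most recent frame one pixel left or right.
    """
    last = history[-1]
    if action == "left":
        return [row[1:] + row[:1] for row in last]
    if action == "right":
        return [row[-1:] + row[:-1] for row in last]
    return [row[:] for row in last]  # any other action: repeat the frame


def rollout(
    first_frame: Frame,
    actions: Sequence[str],
    predict: Callable[[Sequence[Frame], str], Frame] = toy_next_frame,
    context: int = 4,
) -> List[Frame]:
    """Generate frames autoregressively from controls.

    The only thing carried between steps is the list of frames itself:
    there is no separate game state (positions, health, inventory).
    """
    frames = [first_frame]
    for action in actions:
        # The predictor sees only the last `context` frames and the action.
        frames.append(predict(frames[-context:], action))
    return frames


if __name__ == "__main__":
    start = [[1, 0, 0]]
    for frame in rollout(start, ["right", "right", "left"]):
        print(frame)
```

One consequence of this design, under the same assumptions: anything that scrolls out of the `context` window is gone, since there is no underlying game state to recover it from.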