Strictly speaking, for raw models: most are now trained on chain-of-thought data, but the planning step may still need to be elicited by the harness or by your own prompt. Since the model is autoregressive, once it has generated something that looks like a plan it will proceed to follow that plan, because the best-predicted next tokens are now tokens that adhere to it.
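To make that concrete, here's a minimal sketch of what a harness-level "plan first" prompt looks like. `generate` is a hypothetical placeholder for whatever completion call you actually use (an API client, a local model), not a real library:

```python
def generate(prompt: str, max_tokens: int = 256) -> str:
    """Hypothetical: return the model's continuation of `prompt`."""
    raise NotImplementedError("wire this to your model/API of choice")

def plan_then_act(task: str) -> str:
    # Step 1: ask for a plan, so the model emits plan-looking tokens.
    plan = generate(f"Task: {task}\nFirst, write a short numbered plan:\n")

    # Step 2: the plan is now literally part of the context, so the
    # highest-probability continuations are tokens that follow it.
    return generate(
        f"Task: {task}\nPlan:\n{plan}\nNow carry out the plan step by step:\n"
    )
```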
Or, in plain English, it's fairly easy to give an AI something that is the practical, functional equivalent of intent, and many real-world applications now do.
It's not a real reasoning step; it's a sequence of steps, carried out in English, that looks like reasoning. It doesn't happen in the same "internal space" as human thought: every time the model outputs a token, the entire internal state vector, and all the possibilities it represents, is collapsed down to a single concrete token. But it is still, as you say, autoregressive.
And thus - in plain English - it is determined entirely by the prompt and the random initial seed. I don't know what that is, but I know it's not intent.
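That determinism is easy to see in a toy sketch. The made-up logits below stand in for whatever next-token distribution the prompt produces; fix the seed and the sampled output is fully reproducible (with greedy decoding the seed wouldn't even matter):

```python
import math
import random

def sample_next(logits: dict[str, float], seed: int) -> str:
    """Sample one token from a softmax over (made-up) logits, seeded."""
    rng = random.Random(seed)
    z = sum(math.exp(v) for v in logits.values())
    probs = {tok: math.exp(v) / z for tok, v in logits.items()}
    r, cum = rng.random(), 0.0
    for tok, p in probs.items():
        cum += p
        if r < cum:
            return tok
    return tok  # numerical fallback

logits = {"plan": 1.2, "maybe": 0.3, "no": -0.5}  # pretend the prompt yields these
print(sample_next(logits, seed=42) == sample_next(logits, seed=42))  # True
```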
Anthropomorphism and Anthropodenial are two different forms of Anthropocentrism.
But the really interesting story to me is when you look at the LLM in its own right, to see what it's actually doing.
I'm not disputing the autoregressive framing. I fully admit I started it myself!
But once we're there, what I really wanted to say (just as Turing and Dijkstra did) is that the really interesting question isn't "is it really thinking?", but rather: what is this kind of process doing, is it useful, what can I do with it or play with, and - relevant to this particular story - what can go (catastrophically) wrong.