Codex is the best at following instructions IME. Claude is pretty good too but is a little more "creative" than codex at trying to re-interpret my prompt to get at what I "probably" meant rather than what I actually said.
It won't make any changes until a detailed plan is generated and approved.
Using Gemini 2.5 or 3, flash.
I am familiar with copilot cli (using models from different providers), OpenCode doing the same, and Claude with just the \A models, but if I ask all 3 the same thing using the same \A model, I SHOULD be getting roughly the same output, modulo LLM nondeterminism, right?