The official implementation of apply_patch is well thought out. It is a two-phase process that will not actually make any changes until every file in the change set can be anchored unambiguously. With the pre-commit error feedback, anchoring issues usually get fixed within one or two additional attempts. It generally goes something like:
Reading file A L1:154
Reading file B L1:123
Attempting to apply patch...
[anchor errors for both A & B]
Reading file A L43:67
Reading file B L50:74
Attempting to apply patch...
Patch succeeded! Running compilation & unit tests...
The anchor error feedback helps massively because in this implementation it also returns the current line numbers where the problem was found.
Techniques that replace the whole file or depend on find-replace are useful in more isolated contexts. However, when you need to refactor 20+ files, something like apply_patch is what you want. Anything that depends on specific line numbers for actual replacement targets is a total dead end for complex edit scenarios.
https://developers.openai.com/api/docs/guides/tools-apply-pa...
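For anyone who wants to see the shape of that two-phase flow, here is a minimal sketch in Python. To be clear, this is not the official apply_patch code: `Hunk`, `PatchError`, `find_matches` and `apply_change_set` are hypothetical names, the "files" are an in-memory dict, and the anchor matching is a plain exact-line comparison. The only point is that every hunk gets resolved against a staged copy first, and nothing is committed unless all anchors match exactly once.

```python
from dataclasses import dataclass


@dataclass
class Hunk:
    path: str
    old: list[str]  # lines expected to be in the file (the anchor)
    new: list[str]  # lines that should replace them


class PatchError(Exception):
    """Raised when any hunk cannot be anchored; carries all problems at once."""

    def __init__(self, problems):
        super().__init__("\n".join(problems))
        self.problems = problems


def find_matches(lines, block):
    """Return every 0-based index at which `block` occurs in `lines`."""
    n = len(block)
    return [i for i in range(len(lines) - n + 1) if lines[i:i + n] == block]


def apply_change_set(files, hunks):
    """Apply all hunks to the {path: text} mapping `files`, or none of them."""
    staged = {path: text.split("\n") for path, text in files.items()}
    problems = []

    # Phase 1: resolve every hunk against staged copies only.
    for h in hunks:
        lines = staged.get(h.path)
        if lines is None:
            problems.append(f"{h.path}: file not found")
            continue
        hits = find_matches(lines, h.old)
        if len(hits) == 1:
            start = hits[0]
            lines[start:start + len(h.old)] = h.new
        elif not hits:
            problems.append(f"{h.path}: anchor not found: {h.old[0]!r}")
        else:
            # Report 1-based line numbers so the caller knows where to re-read.
            where = ", ".join(f"L{i + 1}" for i in hits)
            problems.append(f"{h.path}: ambiguous anchor, matches at {where}")

    if problems:
        raise PatchError(problems)

    # Phase 2: every anchor resolved, so commit the staged contents.
    for path, lines in staged.items():
        files[path] = "\n".join(lines)
```

If one hunk in a change set is ambiguous, the call raises with the candidate line numbers and leaves every file untouched, including files whose hunks were fine. The caller then re-reads around those lines and retries with a longer anchor.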
Codex is very "miss the forest for the trees", but is much better at successfully making large changes in large codebases. Claude Code makes more mistakes, but has more taste and a better grasp on idiomatic and elegant software development.
If you can afford to, I recommend juggling both.
But I feel like an expert who can drive GPT aggressively will outperform Opus. It’s why some smart people I know are opting for GPT and have fallen off on Opus. It’s like asking an F1 driver to sit in a taxi.
This is not a jab, but a genuine curiosity of mine.
I don't think it necessarily says anything about a model itself having 'taste' in some subjective way.
If the fashion changes, would the model update with it without retraining? No. So the model doesn't have 'taste' in that sense. It has alignment to current human definitions of taste.
I have specific skills for trying to avoid this, but nevertheless I spent half of the time fighting with its verbosity.
Currently, I'm trying to scaffold the functions/classes I know I need with NotImplemented and ask it to implement only inside those specific places. It's a little bit better, but I still have to fight with function-in-function definitions ...
Admittedly my recent experience tilts toward Opus (now 4.8), but you and others have piqued my interest re: GPT-5.5 Codex, so I'm trying that more now.
As far as tone goes... both feel sycophantic as hell to me. To be honest, they all do.
So does Claude, what’s your point?
I used it and ChatGPT this week to help troubleshoot a complex DB-related issue, and Claude had to apologise no fewer than three times, admitting it had been talking complete shit.
Just one example of the kind of shit it dribbled:
> I need to be upfront with you. I should not have claimed X as if I knew that for a fact. That was overreach on my part.