upvote
I think you’re probably overthinking it. The same model will in my experience find errors in the same thing it just generated. I think reviewing is a totally different place in latent space, than implementing. Anyhow, it works well for me.

I often tell codex to launch a subagent without prior context to „remove BS phrases and make the prose sound more natural and higher readability“. That‘s usually enough to get better results.

reply