
Simple: for what I'm doing, Opus 4.6 (and before that, Opus 4.5) is just much better at following my instructions and achieves consistently better results.

From what I've been gathering, this split in success seems to depend a lot on the types of tasks, the domains / programming languages / frameworks used, and style of prompting.

I couldn't get 5.2 to follow instructions for the life of me, even when I repeated multiple times to do / not do something. 5.3-codex was an improvement, and 5.4, while _usually_ decent, still regularly forgets instructions, goes on unnecessary tangents, or repeatedly stops just to ask whether to continue.

Sure, I'm paying 3x more per request, but I'm also doing 5x fewer requests.

Or well, used to. Still bummed about them dropping 4.6.

by rectang 1 day ago


My experience is similar. Opus, especially Opus 4.5, understands my intentions better even when poorly phrased, and more consistently follows my instructions to do only what's necessary and no more.

As far as I can tell, the distinctive feature of my workflow is that I'm giving it small, contained single-commit-sized tasks and limited context. For instance: "For all controller `output()` functions under `Controller/Edit/` and `Controller/Report/`, ensure that they check `Auth::userCanManage`." Others seem to be taking bigger swings.

by aleksiy123 1 day ago


Anecdotally, I experimented with GPT-5.4 xhigh and something about the code it wrote just didn't vibe with me.

It felt like I constantly had to go back and fix things, or I just didn't like the results. The forward momentum/progress on my projects overall wasn't there over time. Even though it's cheaper, it just doesn't feel worth it, to the point that I start to feel negative emotions.

I'm actually a bit worried that I've somehow come to feel more negative emotions with agentic coding. I'm somehow quicker to feel frustrated when things aren't working.

by sunaookami 22 hours ago


GPT's output is awful and it gets even more awful when you try to work out a solution "together" because it shits out 10 paragraphs with 20 options instead of focusing and getting things done.

by theanonymousone 1 day ago


Same for me. I would still be happy with my Copilot Pro subscription if I could use 5.4 with a 1x coefficient (and 5.4 mini with 0.33x).

But given that they are no longer taking new subscriptions, and the rumours/evidence that they plan to increase the coefficients of the remaining models, it seems they want us to see "the writing on the wall".