I don’t think that’s true. Claude Opus 4.5/4.6 in Cursor have marked the big shift for me. Before that, agentic development mostly made me want to just do it myself, because it was getting stuck or going on tangents.

I think it can (and is) shifting very rapidly. Everyone is different, and I’m sure models are better at different types of work (or styles of working), but it doesn’t take much to make it too frustrating to use. Which also means it doesn’t take much to make it super useful.

reply
> I don’t think that’s true. Claude Opus 4.5/4.6 in Cursor.

Opus 4.6 has been out for less than a month. If it were a big shift, surely we'd see a massive difference over 4.5, which came out in November. I think this proves the point: you're not seeing seismic shifts every 3 months, and you're not even clear about which model was the fix.

> I think it can (and is) shifting very rapidly.

Shifting, maybe. But shuffling deck chairs every 3 months.

reply
I interpreted their comment to mean 4.5 was the shift, which was November of last year. "Before that" meaning pre-4.5.
reply
It depends on what you're handling. Frontend (not CSS), Swagger, and mundane CRUD are where it shines. Anything more complex that needs harder calculation usually makes the agents struggle.

It's especially good for navigating code you're unfamiliar with. If you already know the code well, you'll usually find it faster to debug and write the code yourself.

I'm using Opus 4.6 with the Claude Code VS Code extension.

reply
Have you tried it with something like OpenSpec? Strangely, taking the time to lay out the steps in a large task helps immensely. It's the difference between the behavior you describe and just letting it run productively for segments of ten or fifteen minutes.
reply
> Have you tried it with something like OpenSpec?

No. The parent comment said I needed a new model, which I've tried. Being told "just try something else as well" kind of proves the point.

reply
I thought this too, and then I discovered plan mode. If you just prompt agent mode it will be terrible, but coming up with a plan first has really made a big difference, and I rarely write code at all now.
reply
My workflow has become very plan-intensive... including planning of verification+test steps at the end.
reply
Agree, it’s strange. I will just assume that the people who say this are building React apps. I still get so much "certainly, I should not do this in a completely insane way, let me fix that" … -400+2. It’s not always like that, and it is better than it was, but that’s it.
reply
I'm an ML engineer, so it's mostly been setting up data processing/training code in PyTorch, if that helps.
reply
At this point though, after Claude C Compiler, you've got to give us more details to better understand the dichotomy. What do you consider simple issues?
reply
> At this point though, after Claude C Compiler,

Perfect example. You mean the C compiler that literally failed to compile a hello world [0] (which was given in its README)?

> What do you consider simple issues?

Hallucinating APIs for well documented libraries/interfaces, ignoring explicit instructions for how to do things, and making very simple logic errors in 30-100 line scripts.

As an example, I asked Claude Code to help me with a Roblox game last weekend, and specifically asked it to "create a shop GUI for <X> which scales with the UI, and opens when you press E next to the character". It proceeded to create a GUI with absolute sizings and got stuck on an API hallucination for handling input. Even after I got it unstuck, the result didn't actually work.

[0] https://github.com/anthropics/claudes-c-compiler/issues/1

reply
Claude C compiler is 100k LOC that doesn’t do anything useful, and cost $20k plus the cost of an expert engineer creating a custom harness and babysitting it.

But the most important thing is that they were reverse engineering gcc by using it as an oracle. And it had gcc and thousands of other C compilers in its training set.

So if you are a large corporation looking to copy GPL code so that you can use it without worrying about the license, and the project you want to copy is a text transformer with a rigorously defined set of inputs and outputs, have at it.

reply