One tip I have is that once you have the diff you want to fix, start a new session and have it work on the diff fresh. They’ve improved this, but it’s still the case that the farther you get into the context window, the dumber and less focused the model gets. I learned this from the Claude Code team themselves, who have long advised starting over rather than trying to steer a conversation that has started down a wrong path.

I have heard from people who regularly push a session through multiple compactions. I don’t think this is a good idea. I virtually never do this — when I see context getting up to even 100k, I start making sure I have enough written to disk to type /new, pipe it the diff so far, and just say “keep going.” I learned recently that even essentials like the CLAUDE.md part of the prompt get diluted through compactions. You can write a hook to re-insert it but it's not done by default.
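For what it's worth, the script side of such a hook can be tiny. This is only a sketch under assumptions: it assumes the hook runs in the project root and that whatever the hook prints to stdout gets added back into the context. How the hook is registered is not shown here and should be checked against the Claude Code hooks documentation. The name `build_reminder` is made up for illustration.

```python
import sys
from pathlib import Path


def build_reminder(project_dir: Path, filename: str = "CLAUDE.md") -> str:
    """Return the instructions file under a short header, or "" if it is missing."""
    path = project_dir / filename
    if not path.is_file():
        return ""
    body = path.read_text(encoding="utf-8").strip()
    return f"Project instructions (re-inserted from {filename}):\n\n{body}\n"


if __name__ == "__main__":
    # Assumption: the hook is run from the project root and its stdout
    # is fed back to the model after a compaction.
    sys.stdout.write(build_reminder(Path.cwd()))
```

If the file is missing the script prints nothing, so it should be harmless in projects without a CLAUDE.md.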

This fresh context thing is a big reason subagents might work where a single agent fails. It’s not just about parallelism: each subagent starts with a fresh context, and the parent agent only sees the result of whatever the subagent does — its own context also remains clean.
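A toy model of what I mean, not how any real agent framework is implemented: `run_subagent`, `orchestrate`, and `noisy_worker` are hypothetical names, and the "worker" is just a plain function standing in for a model call.

```python
from typing import Callable

Worker = Callable[[list[str]], str]


def run_subagent(task: str, worker: Worker) -> str:
    """Run one task in a brand-new context and return only the final result."""
    context = [task]  # fresh context: nothing inherited from the parent
    return worker(context)  # the scratch work stays inside `context`


def orchestrate(tasks: list[str], worker: Worker) -> list[str]:
    """Parent agent: its context only ever grows by one line per task."""
    parent_context: list[str] = []
    for task in tasks:
        result = run_subagent(task, worker)
        parent_context.append(f"{task}: {result}")
    return parent_context


def noisy_worker(context: list[str]) -> str:
    """Stand-in worker that fills its own context with scratch output."""
    context.extend(f"scratch line {i}" for i in range(1000))
    return "done"


parent = orchestrate(["rename module", "fix tests"], noisy_worker)
print(len(parent))  # 2 entries, regardless of how noisy each worker was
```

Each worker burns through a thousand lines of its own context, but the parent ends up holding two short lines.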

reply
Yeah, I start most of my sessions now with “read the diff between this branch and main”. Seems like it grounds and focuses it.
reply
Slight tangent: you want to read the diff between your branch and the merge-base with origin/main. Otherwise you get lots of spurious spam in your diff, if main moved since you branched off.
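A rough sketch of the difference, in Python wrapping git (the function names are just for illustration, and `origin/main` is assumed to be your base ref). The three-dot form `git diff A...B` is git's shorthand for diffing `B` against the merge-base of `A` and `B`.

```python
import subprocess


def merge_base_diff_cmd(base_ref: str = "origin/main") -> list[str]:
    """Diff HEAD against its merge-base with base_ref (only your branch's changes)."""
    # `git diff A...B` compares merge-base(A, B) with B.
    return ["git", "diff", f"{base_ref}...HEAD"]


def naive_diff_cmd(base_ref: str = "origin/main") -> list[str]:
    """Diff HEAD against the tip of base_ref (also shows everything main changed since)."""
    return ["git", "diff", base_ref, "HEAD"]


def branch_diff(base_ref: str = "origin/main") -> str:
    """Run the merge-base diff in the current repository and return its text."""
    done = subprocess.run(
        merge_base_diff_cmd(base_ref), check=True, capture_output=True, text=True
    )
    return done.stdout
```

The output of `branch_diff()` is what you would hand to a fresh session.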
reply
One thing that seems important is to have the agent write down its plan and any useful memory in markdown files, so that further invocations can just read from them.
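In practice the agent just writes the file itself, but the shape of it is roughly this. A minimal sketch: the file layout and the helper names `append_note` and `load_notes` are my own assumptions, not any tool's convention.

```python
from datetime import date
from pathlib import Path


def append_note(notes_file: Path, heading: str, body: str) -> None:
    """Append a dated section to the markdown notes file, creating it if needed."""
    entry = f"\n## {heading} ({date.today().isoformat()})\n\n{body.strip()}\n"
    with notes_file.open("a", encoding="utf-8") as f:
        f.write(entry)


def load_notes(notes_file: Path) -> str:
    """Return everything written so far, or "" if there are no notes yet."""
    if not notes_file.exists():
        return ""
    return notes_file.read_text(encoding="utf-8")
```

A new session then starts by reading `load_notes(...)` instead of relying on whatever survived in the old context.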
reply
subagents are huge, could execute on a massive plan that should easily fill up a 200k context window and be done at around 60k for the orchestration agent.

as a cheapass, being able to pass off the simple work to cheaper $ per token agents is also just great. I've got a handful of tasks I can happily delegate to a haiku agent and anything requiring a bit of reasoning goes to sonnet.

Feel like opus is almost a cheatcode when i do get stuck, i just bust out a full opus workflow instead and it just destroys everything i was struggling with usually. like playing on easy mode.

as cool as this stuff is, kinda still wish i was just grandfathered into the plan with no weekly limit and only the 5 hour window limits, id just be happily hammering opus blissfully.

reply
IMO it seems to start "forgetting" or "overlooking" claude.md well before the context window is full.
reply
>"This fresh context thing is a big reason subagents might work where a single agent fails. It’s not just about parallelism: each subagent starts with a fresh context, and the parent agent only sees the result of whatever the subagent does — its own context also remains clean."

This is the true power of agent teams: https://code.claude.com/docs/en/agent-teams

You maintain very low context usage in the main thread; just orchestration and planning details, while each individual team member remains responsible for their own. Allows you to churn through millions of output tokens in a fraction of the time.

reply
Same here. I don't understand how people leave it running on an "autopilot" for long periods of time. I still use it interactively as an assistant, going back and forth and stepping in when it makes mistakes or questionable architectural decisions. Maybe that workflow makes more sense if you're not a developer and don't have a good way to judge code quality in the first place.

There's probably a parallel with the CMSes and frameworks of the 2000s (e.g. WordPress or Ruby on Rails). They massively improved productivity, but as a junior developer you could get pretty stuck if something broke or you needed to implement an unconventional feature. I guess it must feel a bit similar for non-developers using tools like Claude Code today.

reply
>Same here. I don't understand how people leave it running on an "autopilot" for long periods of time.

Things have changed. The models have reached a level of coherence where they can be left to make the right decisions autonomously. Opus 4.6 is in a class of its own now.

reply
A non-technical client of mine has built an entire app with a very large feature set with Opus. I declined to clean it up; I was afraid it would be impossible and too risky. I think we are at a level where it can build and auto-correct its mistakes, but the code is still slop and kind of dangerous to put in production, if you care about even the most basic security.
reply
Branch first so you can just undo. I think this would have worked with sub agents and /loop maybe? Write all items to change to a todo.md. Have it split up the work with haiku sub agents doing 5-10 changes at a time, marking the todos done, and /loop until all are done. You’ll succeed I suspect. If the main claude instance compacts its context - stop and start from where you left off.
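The bookkeeping side of that is simple enough to sketch. This assumes todo.md uses the usual `- [ ]` / `- [x]` checkbox lines; the helper names are hypothetical, and handing each batch to a sub agent is left out.

```python
import re

# Matches "- [ ] item" or "- [x] item" (also "*" bullets).
TODO_LINE = re.compile(r"^\s*[-*] \[( |x|X)\] (.+)$")


def open_items(todo_text: str) -> list[str]:
    """Return the text of every unchecked item, in file order."""
    items = []
    for line in todo_text.splitlines():
        m = TODO_LINE.match(line)
        if m and m.group(1) == " ":
            items.append(m.group(2).strip())
    return items


def batches(items: list[str], size: int = 5) -> list[list[str]]:
    """Split the items into chunks of at most `size` for one sub agent each."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def mark_done(todo_text: str, item: str) -> str:
    """Check off the first unchecked line whose text equals `item`."""
    lines = todo_text.splitlines()
    for i, line in enumerate(lines):
        m = TODO_LINE.match(line)
        if m and m.group(1) == " " and m.group(2).strip() == item:
            lines[i] = line.replace("[ ]", "[x]", 1)
            break
    return "\n".join(lines) + "\n"
```

Looping until `open_items(...)` comes back empty is the "until all are done" part, and because the state lives in the file, a restart after a compaction picks up where it left off.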
reply
It actually did automatically break the work up into chunks and launched a bunch of parallel workers to each handle a smaller amount of work. It wasn't doing everything in a single instance.

The problem wasn't that it lost track of which changes it needed to make, so I don't think checking items off a todo list would have helped. I believe it did actually change all the places in the code it should have. It just made the wrong changes sometimes.

But also, the claim I was responding to was, "I start with a PRD, ask for a step-by-step plan, and just execute on each step at a time." If I have to tell it how to organize its work and how to keep track of its progress and how to execute all the smaller chunks of work, then I may get good results, but the tool isn't as magical (for me, anyway) as it seems to be for some other people.

reply
The next line in the comment you’re responding to is

> Sometimes ideas are dumb, but checking and guiding step by step helps it ship working things in hours.

which matches my experience exactly. I consider it to be about as magical as the parent comment is claiming, but I wouldn’t call it totally automatic.

reply
If you use eslint and tell it how to run lint in CLAUDE.md it will run lint itself and find and fix most issues like this.

Definitely not ideal, but sure helps.

reply
Undefined variable references? Did you not instruct it to run typescript after changes?
reply
deleted
reply
Start over, create a new plan with the lessons learned.

You need to converge on the requirements.

reply
You’re using it wrong. As soon as it starts going off the rails once you’ve repeated yourself, you drop the whole session and start over.
reply
One of the more subtle points that seems to be crucial is that it works a lot better when it can use the context as part of its own work rather than being polluted by unrelated details. Even better than restarting when it's off the rails is to avoid that as much as possible by proactively starting a new conversation as soon as anything in the history of the existing one stops being relevant. I've found it more effective to manually tell a fresh session most of what's currently in the context, skipping the irrelevant bits even if they're fairly small, than to rely on it to figure out that something is no longer relevant (or to give it instructions indicating that, which feels like a crapshoot: it might actually prune, or it might just bloat things further with that instruction added into the mix).

To echo what the parent comment said, it's almost frustrating how effective it can be at certain tasks that I wouldn't ever have the patience for. At my job recently I needed to prototype calling some Python code via WASM using the Rust wasmtime engine. Once I had set up the code structure with the bytes for the WASM component, the arguments I wanted to pass to the function, and the WIT describing the interface for the function, it was able to fill in all of the boilerplate needed so that the function calls worked properly, within a minute or two, on the first try. Reading through all the documentation and figuring out exactly which half dozen assorted things I had to import and hook up together in the correct order would probably have taken me an hour at minimum.

I don't have any particular insight on whether or not these tools will become even more powerful over time, and I still have fairly strong concerns about how AI tools will affect society (both in terms of how they're used and the amount of energy used to produce them in the first place), but given how much the tech industry tends to prioritize productivity over social concerns, I have to assume that my future employment is going to be heavily impacted by my willingness to adopt and use these tools. I can't deny at this point that having it as an option would make me more productive than if I refuse to use it, regardless of my personal opinions on it.

reply