One of the things you get an intuition for after using these systems is when to start a new conversation, and the basic rule of thumb is “always.” Use a conversation for one and only one task or question, then start a new one. For longer projects, have the LLM write down a plan or checklist, and then have it tackle each step in a new conversation. Context collapse happens well before you hit the token limits, and things like ground rules and whatnot stop influencing the LLM’s outputs after a couple tens of thousands of tokens, in my experience.
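
Roughly the shape of the plan file I mean (the file name and steps are made up, just a sketch):

    # PLAN.md
    - [ ] 1. Extract session handling into its own module
    - [ ] 2. Add tests for the new module
    - [ ] 3. Switch callers over, one at a time

Then each new conversation gets “read PLAN.md, do step N only, tick it off when done” and nothing else.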

(Similar guidance goes for writing tools & whatnot - give the LLM exactly and only what it needs back from a tool, don’t try to make it act like a deterministic program. Whether or not they’re capital-I intelligent, they’re pretty fucking stupid.)

reply
Yeah, adherence is a hard problem. It should feel much better with newer models, especially Opus 4.5. I generally find that Opus listens to me the first time.
reply
Have been using Opus 4.5 and can confirm this is how it feels: it just works.
reply
It also works your wallet
reply
Right now Google Antigravity has free Claude Opus 4.5, with pretty decent allowances.

I also use GitHub Copilot, which is just $10/mo. I have to use the official Copilot though; if I try to 'hack it' into working in Claude Code it burns through all the credits too fast.

I am having a LOT of luck using Minimax M2 in Claude Code: it's very cheap and it works really well, close to Sonnet. I use a tool called cc-switch to swap different models in and out of Claude Code.
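
I haven't read cc-switch's internals, but the manual equivalent is just pointing Claude Code at an Anthropic-compatible endpoint before launching it, something like this (the URL and key are placeholders, not real values):

    export ANTHROPIC_BASE_URL="https://your-minimax-compatible-endpoint"   # placeholder
    export ANTHROPIC_AUTH_TOKEN="your-api-key"                             # placeholder
    claude

cc-switch just saves you from juggling those by hand.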

reply
Highly recommend Claude Max, but I also want to point out Opus 4.5 is the cheapest Opus has ever been.

(I just learned ChatGPT 5.2 Pro is $168 per million tokens. Insanity.)

reply
If you pay for a Claude Max subscription it is the same price as previous models.
reply
Just wait a few months -- AI has been getting more affordable _very_ quickly
reply
I’ve felt that the LLM forgets CLAUDE.md after 4-5 messages. So why not reinject CLAUDE.md into the context at the fifth message?
reply
CLAUDE.md should be picked up and injected into every message you send to the model, regardless of whether it's the 1st or 10th message in the same session.
reply
Yes. One of my system-wide instructions is “Read the Claude.md file and any readme in the current directory, then tell me how you slept.”

If Claude yawns or similar, I know it's parsed the files. It hasn't been doing so for the last week or so, except once out of five times last night.

reply
The number of times I’ve written “read your own fucking Claude.md file” is getting a bit too high.

“You’re absolutely right! I see here you don’t want me to break every coding convention you have specified for me!”

reply
How long are your conversations with Claude?

I've used it pretty extensively over the past year and never had issues with this.

If you hit autocompact during a chat, it's already too long. You should've exported the relevant bits to a markdown file and reset context already.
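
Roughly what that looks like in a session (the file name is made up; /clear is the built-in context reset):

    > summarize the decisions and open questions so far into notes.md
    /clear
    > read notes.md and continue with the next step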

reply
for i in $(seq 1 100); do cat CLAUDE.md; done
reply
The attention algo does that; it has a recency bias. Your observation is not necessarily a sign that Claude isn't loading CLAUDE.md.

I think you may be observing context rot? How many back-and-forths are you in when you notice this?

reply
I know the reason; I just took the opportunity of replying to a Claude dev to point out why it's no panacea and why it requires consistent context management.

The only semi-productive workflow is really "write plans in markdown files -> new chat -> implement a few things -> update the plans -> new chat", and so on.
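
You can even do the dumbest version of that loop with headless runs, one fresh session per step (just a sketch; the plan file and prompts are made up):

    claude -p "Read PLAN.md, implement step 1 only, then update PLAN.md"
    claude -p "Read PLAN.md, implement step 2 only, then update PLAN.md"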

reply
That explains why it happens, but doesn't really help with the problem. The expectation I have as a pretty naive user is that what is in the .md file should be permanently in the context. It's good to understand why this is not the case, but it's unintuitive and can lead to frustration. It's bad UX, if you ask me.

I'm sure there are workarounds such as resetting the context, but the point is that good UX would mean such tricks are not needed.

reply
Yeah, the current best approach is to aggressively compact and recreate context by starting fresh. It’s awkward and I wish I didn’t have to.
reply
I'm surprised this hasn't been automated yet, but I'm pretty naive to the space - the problem of "when?"/"how often?" seems like a fun one to chew on.
reply
I think Gemini 3 pro (high) in Antigravity does something like that because I can keep asking for different changes in the same chat without needing to create a new session.
reply
It’s not that it’s not in the context; it’s that it was injected so far back that it’s deemed not so important when determining the next token.
reply