One of the things you get an intuition for after using these systems is when to start a new conversation, and the basic rule of thumb is “always.” Use a conversation for one and only one task or question, then start a new one. For longer projects, have the LLM write down a plan or checklist, and then have it tackle each step in a new conversation. Context collapse happens well before you hit the token limits, and things like ground rules and whatnot stop influencing the LLM’s outputs after a couple tens of thousands of tokens, in my experience.
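
Roughly the shape of the plan file I mean (the file name and steps are made up, just a sketch):

    # PLAN.md
    - [ ] 1. Extract session handling into its own module
    - [ ] 2. Add tests for the new module
    - [ ] 3. Switch callers over, one at a time

Then each new conversation gets “read PLAN.md, do step N only, tick it off when done” and nothing else.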

(Similar guidance goes for writing tools & whatnot - give the LLM exactly and only what it needs back from a tool, don’t try to make it act like a deterministic program. Whether or not they’re capital-I intelligent, they’re pretty fucking stupid.)

reply
Yeah, adherence is a hard problem. It should feel much better with newer models, especially Opus 4.5. I generally find that Opus listens to me the first time.
reply
Have been using Opus 4.5 and can confirm this is how it feels: it just works.
reply
It also works your wallet
reply
Right now Google Antigravity has free Claude Opus 4.5, with pretty decent allowances.

I also use GitHub Copilot, which is just $10/mo. I have to use the official Copilot though; if I try to 'hack it' into working in Claude Code it burns through all the credits too fast.

I am having a LOT of luck using Minimax M2 in Claude Code: it's very cheap and it works really well, close to Sonnet. I use a tool called cc-switch to swap different models in and out of Claude Code.
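
I haven't read cc-switch's internals, but the manual equivalent is just pointing Claude Code at an Anthropic-compatible endpoint before launching it, something like this (the URL and key are placeholders, not real values):

    export ANTHROPIC_BASE_URL="https://your-minimax-compatible-endpoint"   # placeholder
    export ANTHROPIC_AUTH_TOKEN="your-api-key"                             # placeholder
    claude

cc-switch just saves you from juggling those by hand.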

reply
Highly recommend Claude Max, but I also want to point out Opus 4.5 is the cheapest Opus has ever been.

(I just learned ChatGPT 5.2 Pro is $168 per million tokens. Insanity.)

reply
If you pay for a Claude Max subscription it is the same price as previous models.
reply
Just wait a few months -- AI has been getting more affordable _very_ quickly
reply
I’ve felt that the LLM forgets CLAUDE.md after 4-5 messages. So why not reinject CLAUDE.md into the context at the fifth message?
reply
CLAUDE.md should be picked up and injected into every message you send to the model, regardless of whether it's the 1st or 10th message in the same session.
reply
Yes. One of my system-wide instructions is “Read the Claude.md file and any readme in the current directory, then tell me how you slept.”

If Claude yawns or similar, I know it's parsed the files. It hasn't been doing so for the last week or so, except once out of five times last night.

reply
The number of times I’ve written “read your own fucking Claude.md file” is getting a bit too high.

“You’re absolutely right! I see here you don’t want me to break every coding convention you have specified for me!”

reply
How long are your conversations with Claude?

I've used it pretty extensively over the past year and never had issues with this.

If you hit autocompact during a chat, it's already too long. You should've exported the relevant bits to a markdown file and reset context already.
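
Roughly what that looks like in a session (the file name is made up; /clear is the built-in context reset):

    > summarize the decisions and open questions so far into notes.md
    /clear
    > read notes.md and continue with the next step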

reply
for i in $(seq 1 100); do cat CLAUDE.md; done
reply
The attention algo does that; it has a recency bias. Your observation is not necessarily a sign that Claude isn't loading CLAUDE.md.

I think you may be observing context rot? How many back-and-forths are you in when you notice this?

reply
I know the reason; I just took the opportunity of replying to a Claude dev to point out why it's no panacea and why it requires consistent context management.

The only semi-productive workflow is really "write plans in markdown files -> new chat -> implement a few things -> update the plans -> new chat", and so on.
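
You can even do the dumbest version of that loop with headless runs, one fresh session per step (just a sketch; the plan file and prompts are made up):

    claude -p "Read PLAN.md, implement step 1 only, then update PLAN.md"
    claude -p "Read PLAN.md, implement step 2 only, then update PLAN.md"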

reply
That explains why it happens, but doesn't really help with the problem. The expectation I have as a pretty naive user is that what is in the .md file should be permanently in the context. It's good to understand why this is not the case, but it's unintuitive and can lead to frustration. It's bad UX, if you ask me.

I'm sure there are workarounds such as resetting the context, but the point is that good UX would mean such tricks are not needed.

reply
Yeah, the current best approach is to aggressively compact and recreate context by starting fresh. It’s awkward and I wish I didn’t have to.
reply
I'm surprised this hasn't been automated yet, but I'm pretty naive to the space - the problem of "when?"/"how often?" seems like a fun one to chew on.
reply
I think Gemini 3 pro (high) in Antigravity does something like that because I can keep asking for different changes in the same chat without needing to create a new session.
reply
It’s not that it’s not in the context; it’s that it was injected so far back that it’s deemed not so important when determining the next token.
reply