I read it as saying that a model's performance is random, and that the observed differences in opinion are the result of overinterpreting random outcomes.

I think, however, that some people seem to always be lucky, which suggests it is not random but rather down to fixed differences between people and their environments.

by embedding-shape, 10 hours ago

> I've barely seeded the context at that point

I think that's the issue, rather than 60K being small.

Most of the actual edits/changes I request from codex are solved within 100-150K tokens. Beyond 200K I'd definitely try to restart the session as soon as I could, as all models are horrible once you get past ~20% of the total context size. And this is while working on codebases of over a million LOC.

The problem, I guess, is that there is no solid, concrete evidence of this degradation (obvious to me, and seemingly to others). It should be easy to prove, yet no one has time to sit down and show it :)

But the likelihood of a model getting minor details wrong seems to skyrocket once you're above some magical threshold between 15% and 20%, and I've hit that issue often enough that my workflow is now built around preventing it.
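
A minimal sketch of that kind of guard, assuming a hypothetical 1M-token window and the ~20% cutoff mentioned above. The function name and the numbers are illustrative only, not something codex or any other tool exposes:

```python
def should_restart(tokens_used: int, context_window: int, threshold: float = 0.20) -> bool:
    """Return True once the session has used more than `threshold` of the window.

    `threshold` is an assumed rule of thumb (the ~20% figure from the comment),
    not a measured degradation point.
    """
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    return tokens_used > threshold * context_window


# Hypothetical session sizes against an assumed 1M-token window.
print(should_restart(150_000, 1_000_000))  # under the cutoff
print(should_restart(250_000, 1_000_000))  # over the cutoff
```

The cutoff is a parameter because the threshold itself is anecdotal; anyone who actually measures the degradation could plug in their own number.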

by rtpg, 7 hours ago

What are y'all doing to hit that? Do you just not give it any pointers and let it churn away? What kind of context are you handing off?

I routinely get claude to do things pretty decently and finish up easily in the 4-5 digit range of tokens. It seems to do the right kind of thing and not waste its time looking at 1000 files.