i've never had to use control + o before but with the latest changes, i give Opus a simple task that should take a few seconds and it's like "used 15k tokens" and "thinking" for three minutes with absolutely zero indication or visibility as to what it's actually doing and i have to ESC ESC it to stop and ask what the FUCK are you actually doing claude?

by misnome 5 hours ago

Yes, I’ve been evaluating since the start of the year and since 4.6 suddenly the most innocuous requests will sit there “thinking” for 5+ minutes and if I can get it to show me the thinking it’s just going round in circles.

Or it decides it needs to get API documentation and spends tens of thousands of tokens fetching every file in a repo with separate tool calls instead of reading the documentation.

Profitable, if you are charging for token usage, I suspect.

But I’m reaching the point where I can’t recommend Claude to people who are interested in skeptically trying it out, because of the default model.

by scottyah 5 hours ago

Yeah, after my switch to Opus 4.6 I noticed a lot of this. I've been wary that eventually models are going to optimize for token usage increases, since that's how the company makes money. I told it to read the files in my directory (4 files, longest was like 380 lines) and caught it making 14 tool calls, including head -n 20 and tail -n 20 on a file. Definitely a "what are you doing" moment.

by misnome 3 hours ago

OTOH I find it pretty funny that the instant they manage to make a model that breaks general containment of popularity and usefulness (4.5), the toxicity of the business model kicks in and they instantly enshittify.

by 8note 2 hours ago

i think yesterday it ate the whole context window in one thinking call.

i bet in a week it'll eat the whole 5-hour throttle in one call too :P

by virtue3 5 hours ago

I think this change is really disingenuous.

If they hide how the tool is accessing files (aka using tokens) and then charge us per token, how are we able to track even loosely what our spend is?

I’m all for simplification of the UX. But when it’s helping to hide the main spend it feels shitty.