Most LLMs emit a long run of intermediate reasoning tokens, often called chain of thought, before producing the actual response. This has been shown to substantially improve performance, but it consumes many more tokens.
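A rough illustration of the accounting (all names and numbers here are hypothetical): the chain-of-thought tokens are generated before the visible answer, so even a short reply can cost far more than what you actually see.

```python
def billed_output_tokens(reasoning_tokens: int, visible_tokens: int) -> int:
    # Hypothetical accounting: hidden chain-of-thought ("reasoning") tokens
    # are generated as output too, so they count toward usage alongside
    # the visible answer.
    return reasoning_tokens + visible_tokens

# A short answer preceded by lengthy reasoning dominates the bill:
total = billed_output_tokens(reasoning_tokens=1200, visible_tokens=80)
print(total)  # 1280 tokens billed for an 80-token visible reply
```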
Yup, they all need to do this in case you're asking them a really hard question like: "I really need to get my car washed, the car wash place is only 50 meters away, should I drive there or walk?"
One very specific and limited example: when asked to build something, 4.6 seems to do more web searches in the domain to gather the latest best practices for various components/features before planning and implementing.
I've found that Opus 4.6 is happy to read a significant amount of the codebase in preparation to do something, whereas Opus 4.5 tends to be much more efficient and targeted about pulling in relevant context.
And way faster too!
They're talking about the output consuming tokens from the pool allowed by the subscription plan.
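A minimal sketch of what that depletion looks like, assuming (hypothetically) that reasoning and visible output draw down the same per-plan allowance:

```python
# Hypothetical subscription pool: every output token, whether hidden
# reasoning or visible answer, draws down the same allowance.
pool = 10_000

# (reasoning_tokens, visible_tokens) per request -- illustrative numbers.
requests = [(1200, 80), (3000, 150)]

for reasoning, visible in requests:
    pool -= reasoning + visible

print(pool)  # 5570 tokens left after two requests
```

The point of the sketch: two requests with modest visible answers (80 and 150 tokens) still consumed over 4,400 tokens of the pool, because the reasoning tokens count too.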
Thinking tokens, output tokens, etc., plus being more clever about file reads and tool calls.