
I think this is a fundamental LLM issue. I recall a paper a while back about trying to get LLMs to be too succinct, and the problem is that, with the way they are implemented, the only way they can "think" is to emit a token. IIRC it demonstrated that even when the model is just babbling something like "Yeah, let's take a look at the issue you just raised", that superficially useless output was also changing the model's state in ways related to solving the problem.

Understanding that also helps you not be annoyed by things like "Let's do X. No, wait, X has this problem, let's do Y instead." You might think to yourself, "If X was a bad idea, couldn't it have considered X and rejected it without outputting a token?" The answer is that the sentence was the model considering X and rejecting it, and no, there is no way for it to do that without emitting tokens. Thinking is inextricably tied to output for LLMs.
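To make that concrete, here is a toy sketch of an autoregressive loop. It is not how any real model is implemented: `generate` and `toy_step` are hypothetical names, and `toy_step` is a hand-written stand-in for a model. The only point it illustrates is that nothing carries over between steps except the tokens already emitted, so "consider X, then reject it" has to show up in the output.

```python
from typing import Callable, List


def generate(step: Callable[[List[str]], str],
             prompt: List[str],
             max_tokens: int) -> List[str]:
    """Toy autoregressive loop: the token list is the only state between steps."""
    tokens = list(prompt)
    for _ in range(max_tokens):
        nxt = step(tokens)   # the "model" sees only the tokens so far
        tokens.append(nxt)   # the only way to remember anything is to emit it
        if nxt == "<eos>":
            break
    return tokens


def toy_step(tokens: List[str]) -> str:
    """Hypothetical stand-in for a model: proposes X, rejects it, settles on Y."""
    if "Y" in tokens:
        return "<eos>"
    if "no, wait" in tokens:
        return "Y"
    if "X" in tokens:
        return "no, wait"
    return "X"


out = generate(toy_step, ["task"], max_tokens=10)
print(out)
```

In this sketch, if you forbid `toy_step` from emitting the "no, wait" token, it never reaches "Y", because it has nowhere else to record that X was rejected.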

There is even some fairly substantial evidence from a couple of different angles that the thinking output is only somewhat loosely correlated to what the model is "actually" doing.

Token efficiency is an interesting question to ponder, and it is worth worrying that providers have an incentive to be flabby with their tokens when you're paying per token. But the question is certainly not as easy as just trying to get the models to be "more succinct" in general.

I often discuss a "next gen" AI architecture after LLMs, and I anticipate one of the differences will be the ability to think without also having to output anything. LLMs are really nifty, but they store too much of their "state" in their own output. As a human being, I find, like many other people, that if I'm doing deep thinking on a topic it helps to write stuff down, but it certainly isn't necessary for me to continuously output things in order to think about them. If anything I'm on the "absent minded"/"scatterbrained" side... if I'm storing a lot of my state in my output for the past couple of hours, then it sure isn't terribly accessible to my conscious mind when I do things like open the pantry door, having totally forgotten my reason for opening it somewhere on the walk to the pantry.

by joquarky13 hours ago|


There might be a reason it works that way.

https://en.wiktionary.org/wiki/Chesterton%27s_fence

by verdverm16 hours ago|


Iteration and co-authoring is the strategy I've settled on.