by s3p 5 hours ago

by ceroxylon 49 minutes ago

I once saw "now that I've slept on it" in Gemini's CoT... baffling.

by raducu 1 hour ago

> Don't get me started on the thinking tokens.

Claude provides nicer explanations, but when it comes to CoT tokens, or just prompting the LLM to explain itself, I'm very skeptical of how truthful the explanation is.

Not because the LLM lies, but because humans do the same thing: when asked how they figured something out, they'll provide a reasonable-sounding chain of thought, but it's not how they actually figured it out.

by foz 5 hours ago

This is part of the reason I don't like to use it. I feel it's hiding things from me, compared to other models that very clearly share what they are thinking.

by dumpsterdiver 3 hours ago

To be fair, considering that the CoT exposed to users is a sanitized summary of the path traversal, one could argue that a sanitized CoT is closer to hiding things than simply omitting it entirely.

by mikestorrent 2 hours ago

This is something that bothers me. We had a beautiful trend on the Web of the browser also being the debugger - from View Source decades ago all the way up to the modern browser console inspired by Firebug. Everything was visible, under the hood, if you cared to look. Now, a lot of "thinking" is taking place under a shroud, and only so much of it can be expanded for visibility and insight into the process. Where is the option to see the entire prompt that my agent compiled and sent off, raw? Where's the option to see the output, replete with thinking blocks and other markup?

by fragmede 1 hour ago

If that's what you're after, you can MITM it: set up a proxy so that Claude Code (or whatever) sends its requests to your program, and that program then forwards them to Anthropic's server (or whoever's). That way, you get everything.
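
For anyone who wants to try this, here is a rough sketch of such a proxy using only the Python standard library. It is an illustration under assumptions, not a tested setup for any particular tool: the upstream URL, the port, and the helper names (`forwardable_headers`, `describe`, `LoggingProxy`, `run`) are all made up for this example, and it assumes your client lets you override its API base URL so that it talks plain HTTP to localhost. It also buffers each response in full before passing it on, so streamed output arrives in one piece rather than incrementally.

```python
import json
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption: the upstream API base URL. Change it for your provider.
UPSTREAM = "https://api.anthropic.com"

# Headers the proxy regenerates itself instead of forwarding verbatim.
SKIP_HEADERS = {"host", "content-length", "connection",
                "accept-encoding", "transfer-encoding"}


def forwardable_headers(headers):
    """Return a copy of the headers minus the ones in SKIP_HEADERS."""
    return {k: v for k, v in headers.items() if k.lower() not in SKIP_HEADERS}


def describe(label, body):
    """Format a raw body for the log: pretty-print JSON, else keep the text."""
    text = body.decode("utf-8", errors="replace")
    try:
        text = json.dumps(json.loads(text), indent=2)
    except ValueError:
        pass  # not JSON (for example an event-stream body); log it as-is
    return f"--- {label} ---\n{text}"


class LoggingProxy(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and log the raw request exactly as the client sent it.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        print(describe(f"request {self.path}", body))

        upstream_request = urllib.request.Request(
            UPSTREAM + self.path,
            data=body,
            headers=forwardable_headers(dict(self.headers)),
            method="POST",
        )
        try:
            with urllib.request.urlopen(upstream_request) as resp:
                status, headers, payload = resp.status, resp.headers, resp.read()
        except urllib.error.HTTPError as err:
            # Error responses still carry a status code and a body worth logging.
            status, headers, payload = err.code, err.headers, err.read()

        # Log the raw response, then hand it back to the client unchanged.
        print(describe(f"response {status}", payload))
        self.send_response(status)
        self.send_header("Content-Type",
                         headers.get("Content-Type", "application/octet-stream"))
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def run(port=8080):
    """Serve the logging proxy on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), LoggingProxy).serve_forever()
```

Calling `run()` and pointing the client's base URL at `http://127.0.0.1:8080` should print every request and response body to the terminal. Whether the output includes full thinking blocks still depends on what the upstream server chooses to send.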

by mikestorrent 9 minutes ago

I'm aware that this is possible, and thank you for the suggestion, but surely you can see that it's a relatively large lift; it may not work in controlled enterprise environments; and compared to just right click -> view source, it's basically inaccessible to anyone who might have wanted to dabble.

by dist-epoch 5 hours ago

That's not the real thinking, it's a super summarized view of it.