It's why it starts with "You're absolutely right!" It's not to flatter the user. It's a cheap way to steer the rest of the response into a space where it actually uses the correction.
People have researched pause tokens for this exact reason.
What do you think chain of thought reasoning is doing exactly?
You’re conflating training and inference.