It is too easy to jailbreak models with prefill, which is probably why it was removed. But I like that this pushes people towards open source models. llama.cpp supports prefill and even GBNF grammars [1], which is useful if, for example, you are working with a custom programming language (see the sketch below).

[1] https://github.com/ggml-org/llama.cpp/blob/master/grammars/R...
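A minimal sketch of both features, using the llama-cpp-python bindings (my choice for brevity; the comment above only mentions llama.cpp itself). The model path is hypothetical; the grammar restricts output to "yes" or "no", and the prompt ends mid-turn to demonstrate prefill:

    # Sketch, not anyone's production code: constrained decoding plus
    # prefill via the llama-cpp-python bindings.
    from llama_cpp import Llama, LlamaGrammar

    llm = Llama(model_path="./models/example.gguf")  # hypothetical path

    # Tiny GBNF grammar: the model may only emit "yes" or "no".
    grammar = LlamaGrammar.from_string('root ::= "yes" | "no"')

    # Prefill: the raw prompt ends mid-turn ("A: "), so the model
    # continues from there instead of starting a fresh reply.
    out = llm("Q: Is the sky blue?\nA: ", grammar=grammar, max_tokens=4)
    print(out["choices"][0]["text"])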

So what exactly is the input to Claude for a multi-turn conversation? I assume delimiters are added to distinguish user turns from Claude's (otherwise a prefill would be the same as just ending your input with the prefill text)?
> So what exactly is the input to Claude for a multi-turn conversation?

No one (approximately) outside of Anthropic knows, since the chat template is applied on the API backend; we only know the shape of the API request. You can get a rough idea of what it might look like from the chat templates published for various open models, but the actual details are opaque.
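For illustration only, here is roughly what a ChatML-style template (used by several open models) looks like and how a prefill maps onto it. Anthropic's actual template is unknown, so every delimiter below is an assumption:

    # Sketch of a ChatML-style chat template; Anthropic's real template
    # is not public, so these delimiters are illustrative assumptions.
    def render(turns, prefill=""):
        out = ""
        for role, text in turns:
            out += f"<|im_start|>{role}\n{text}<|im_end|>\n"
        # Prefill: open the assistant turn but never close it, so the
        # model's continuation is glued directly onto the prefill text.
        return out + "<|im_start|>assistant\n" + prefill

    print(render([("user", "Name a color.")], prefill="The color "))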

A bit of historical trivia: OpenAI disabled prefill in 2023 as a safety precaution (e.g., against jailbreaks like prefilling " genocide is good because"), but Anthropic kept prefill around partly because they had greater confidence in their safety classifiers (https://www.lesswrong.com/posts/HE3Styo9vpk7m8zi4/evhub-s-sh...).