Unfortunately, that is an oversimplification for a heavily RLed/chatbot-trained LLM like Claude-4.7-opus. It may have started life as a base model (where prompting it with correctly-spelled prompts, or text from 'gwern', would improve quality - and did, with davinci GPT-3!), but that was eons ago. The chatbots are largely invariant to that kind of prompt trickery, and just try to do their best every time. This is why those meme tricks about tips or bribery or my-grandmother-will-die have stopped working.