>This isn’t accurate - most of the style comes from the fine tuning and reinforcement learning, not from the original training data.

Fine-tuning, reinforcement learning, etc. are all 'training' in my book. Perhaps that's the source of the confusion over where 'people got this idea'.

reply
> Fine tuning, reinforcement, etc are all 'training' in my books.

They are, but they have nothing to do with how frequent anything is in the literature, which was your main point.

reply
Agreed. The pre-2025 base models don't write like this.
reply
So LLMs have gotten creativity recently?
reply
No, my point has nothing to do with creativity. It's about the fact that their output is tailored to look and sound a certain way in the later stages of model training; it's not representative of the original text data the base model was trained on.
reply