Then I saw someone's Show HN post for their own vibecoded programming language project, and many of the feature bullet points were the same. Maybe it was partly coincidence (all modern PLs have a fair bit of overlap), but it really gave me pause, and I mostly lost interest in the project after that.
https://arxiv.org/pdf/1607.06450
Depending on the model architecture, there is normalization taking place in multiple different places in order to save compute and ensure (some) consistency in output. Training, by its very nature, also is a normalization function, since you are telling the model which outputs are and are not valid, shaping weights that define features.