That's the thing about a normalization system: it is going to normalize outputs, because it's not built to output uniqueness, it's built to winnow uniqueness down to a baseline. That is good in some instances, assuming the baseline is correct, but it also narrows the aperture of human expression.
reply
I agree in a "the purpose of a system is what it does" sense, but I'm not sure they're inherently normalization systems.
reply
Token selection is based on normalization. Even if you train a model to produce outlier answers, that process still biases it toward a subset of outliers, which is inherently normalizing.
reply
Could you elaborate on "token selection is based off normalization"?
reply
Sure:

https://arxiv.org/pdf/1607.06450

Depending on the model architecture, normalization takes place in several different places in order to save compute and ensure (some) consistency in output. Training, by its very nature, is also a normalization function, since you are telling the model which outputs are and are not valid, which shapes the weights that define features.
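
To make that a bit more concrete, here is a rough sketch of two normalization steps as I understand them: layer normalization in the style of the linked paper, and the softmax that turns scores into token probabilities. The function names, the scalar `gain`/`bias` (the paper uses per-element vectors), the `eps` value, and the sample numbers are my own simplifications for illustration, not anything taken from a real model.

```python
import math

def layer_norm(x, gain=1.0, bias=0.0, eps=1e-5):
    """Rescale one vector of activations to roughly zero mean and unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    # eps guards against division by zero when all activations are equal
    return [gain * (v - mean) / math.sqrt(var + eps) + bias for v in x]

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical activations for a single position
hidden = [2.0, -1.0, 0.5, 4.0]
normed = layer_norm(hidden)   # same ordering, standardized scale
probs = softmax(normed)       # a distribution a token could be sampled from
```

Both steps keep the relative ordering of the inputs but force them onto a fixed scale, which is the narrow, mechanical sense of "normalization" here.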

reply