This is interesting point. I think lot of people assume that training gathers new "metadata" based on the original data and ignore what the training optimises for which is direct copy of the input. Training a model results in a fancy copy paste (unless incentivised differently).