I think the case is strongest with RLHF. If your model speaks with a distinctive “voice”, and to give it that voice you had to carefully craft training data, such that there are obvious similarities (shared turns of phrase, etc.) between your RLHF training input and the model’s outputs, then that aspect of the model likely is copyrightable. But if you are trying to improve a model’s performance on mathematics problems, then no matter how much creativity you put into choosing training data, it is unlikely that identifiable creative elements from the training data survive in the model output, which suggests that the creativity didn’t actually make it into the model in the sense relevant to US copyright law.