Alibaba didn't steal Opus weights; they used Opus output to train their model.
If this is piracy, then so are the reverse engineering efforts powering a bunch of Linux drivers.
Also, yeah, they already stole their copyrighted works, so a thief from a thief is still... thieves?
If you have ethical concerns, model distillation feels like an arbitrary line to draw. Why is the first type of piracy OK but the second not? You should restrict yourself to ethical open source models. Which is, btw, where I genuinely hope the future of local models is going to lie. Open weights are not enough; we need fully open source models for this to be sustainable, even for simple things like updating the knowledge cutoff. How we are going to distribute the training effort is an interesting problem, and I don't see an obvious solution yet. Maybe the blockchain/federated learning people can suggest something. Or university consortia, or some public sector solutions. Or something really boring: I for one would absolutely be willing to pay for DRM-free weights of an open source model (even if I could pirate them for free).
Btw, ethically sourced, open source LLMs exist! Check out, e.g., Olmo by Allen AI: https://allenai.org/olmo