Good question. I actually have a technical answer, believe it or not.

Pre-training is: training a model from scratch on cheap data that sets the foundation of a model's capabilities. It produces a base model.

Post-training is: training a base model further, using expensive specialized data, direct human feedback, and elaborate compute-intensive methods to refine the model's behavior and imbue it with the capabilities that pre-training alone failed to teach it. It produces the model that's actually deployed.

When people perform distillation attacks, they take an existing base model and try to post-train it using the outputs of another proprietary model.

They're not aiming to imitate the cheap bulk pre-training data - they're aiming to imitate the expensive in-house post-training steps, the ones that frontier labs have spent a lot of AI-specialized data, compute, labor, and R&D hours on.
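For intuition, classic distillation trains the student to match the teacher's output distribution, typically by minimizing the KL divergence between temperature-softened softmax outputs. Here's a minimal sketch in plain Python; all names are my own, and note that in an API-only distillation attack the attacker usually sees sampled text rather than logits, so the objective degrades to ordinary next-token cross-entropy on the teacher's outputs:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax over a list of logits.
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) on temperature-softened distributions:
    # the classic soft-label distillation objective. A higher
    # temperature exposes more of the teacher's "dark knowledge"
    # about relative probabilities of non-top tokens.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Identical logits give zero loss; mismatched logits give a positive loss.
print(distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))      # 0.0
print(distillation_loss([2.0, 1.0, 0.1], [0.1, 1.0, 2.0]) > 0)  # True
```

In a real pipeline this loss would be computed per token position over the full vocabulary and backpropagated through the student; the teacher's side is just data.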

This is probably not "fair use", because it directly tries to take and replicate a frontier lab's competitive edge, but that hasn't been tested in court. And many of the companies caught doing this for their own commercial models are in China, so the path to legal recourse is shaky at best. What is on the table is restricting access to the full chain of thought and banning suspected distillation attackers from the inference API. Which is a bit like trying to stop a sieve from leaking - but it may at least slow the competitors down.
