If the smarts came from post-training, we could show significant gains by doing that post-training again for previous generations of models. But we know that isn’t happening - effective post training is necessary but not sufficient for model performance.
reply