But reimplementing that isn't impressive, because it's not a clean-room implementation if you trained on that data to make the model that regurgitates the effort.
Are you sure about that? Do you have some examples? The older Claude models can’t do it according to TFA.