Maybe a more idealized training set could improve things, but at least for today’s SOTA, you have to get the shitty first draft out and then improve it.
Harnessing makes a difference, but it’s only shuffling around when and where the tokens get generated. It can trade being slower by doing a hidden first draft and only showing the output after doing a self review. But the models still need to generate it all explicitly.