The reason I mentioned "purely autoregressive" is that, realistically, I expect hybrid diffusion + autoregressive models to be the first diffusion models to become popular. I could be wrong, though. Diffusion models also have other tricks, like really easy integration with simple classifiers.

Check out this paper where they use diffusion during inference on the autoencoded prediction of an autoregressive model: https://openreview.net/forum?id=c05qIG1Z2B
