Paper Title

Truncated Diffusion Probabilistic Models and Diffusion-based Adversarial Auto-Encoders

Paper Authors

Huangjie Zheng, Pengcheng He, Weizhu Chen, Mingyuan Zhou

Paper Abstract

Employing a forward diffusion chain to gradually map the data to a noise distribution, diffusion-based generative models learn how to generate the data by inferring a reverse diffusion chain. However, this approach is slow and costly because it needs many forward and reverse steps. We propose a faster and cheaper approach that adds noise not until the data become pure random noise, but until they reach a hidden noisy-data distribution that we can confidently learn. Then, we use fewer reverse steps to generate data by starting from this hidden distribution, which is made similar to the noisy data. We reveal that the proposed model can be cast as an adversarial auto-encoder empowered by both the diffusion process and a learnable implicit prior. Experimental results show that, even with a significantly smaller number of reverse diffusion steps, the proposed truncated diffusion probabilistic models provide consistent performance improvements over the non-truncated ones in both unconditional and text-guided image generation.
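To make the truncation idea concrete, below is a minimal sketch of how sampling changes in a truncated diffusion model: instead of starting the reverse chain from pure Gaussian noise at the full step T, it starts from a learned implicit prior at a truncation step and runs only that many reverse steps. This is a sketch under stated assumptions, not the paper's released implementation; the names (`eps_model`, `prior_generator`, `T_TRUNC`), the latent dimension, and the linear beta schedule are all illustrative choices.

```python
import torch

T = 1000        # full diffusion length of a non-truncated DDPM
T_TRUNC = 100   # truncation point: the reverse chain runs only this many steps

# Standard linear DDPM noise schedule (an assumption, not from the paper's code).
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alphas_bar = torch.cumprod(alphas, dim=0)

@torch.no_grad()
def truncated_sample(eps_model, prior_generator, shape):
    """Sample by starting the reverse diffusion chain at step T_TRUNC.

    `prior_generator` stands in for the learnable implicit prior, which the
    paper trains (adversarially) so its samples match the noisy-data
    distribution q(x_{T_trunc}); `eps_model` is a noise-prediction network.
    """
    z = torch.randn(shape[0], 128)     # latent code; dimension is arbitrary here
    x = prior_generator(z)             # sample approximating q(x_{T_trunc})
    for t in reversed(range(T_TRUNC)): # only T_TRUNC reverse steps instead of T
        t_batch = torch.full((shape[0],), t, dtype=torch.long)
        eps = eps_model(x, t_batch)
        # Standard DDPM posterior mean given the predicted noise.
        mean = (x - betas[t] / torch.sqrt(1.0 - alphas_bar[t]) * eps) / torch.sqrt(alphas[t])
        if t > 0:
            # One common variance choice: sigma_t^2 = beta_t.
            x = mean + torch.sqrt(betas[t]) * torch.randn_like(x)
        else:
            x = mean
    return x
```

The per-step update is the ordinary DDPM ancestral-sampling step; the savings come entirely from replacing the pure-noise starting point at step T with a learned prior at step T_TRUNC, which is what allows the reverse chain to be short.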
