On Conditioning the Input Noise for Controlled Image Generation with Diffusion Models

Paper title

On Conditioning the Input Noise for Controlled Image Generation with Diffusion Models

Authors

Vedant Singh, Surgan Jandial, Ayush Chopra, Siddharth Ramesh, Balaji Krishnamurthy, Vineeth N. Balasubramanian

Abstract

Conditional image generation has paved the way for several breakthroughs in image editing, stock photo generation, and 3-D object generation. It continues to be a significant area of interest with the rise of new state-of-the-art methods based on diffusion models. However, diffusion models provide very little control over the generated image, which has led subsequent works to explore techniques such as classifier guidance, which provides a way to trade off diversity for fidelity. In this work, we explore techniques to condition diffusion models with carefully crafted input noise artifacts. This allows generation of images conditioned on semantic attributes. It differs from existing approaches, which input Gaussian noise and introduce conditioning later, at the diffusion model's inference step. Our experiments over several examples and conditional settings show the potential of our approach.
