Paper Title


Foreground-Background Separation through Concept Distillation from Generative Image Foundation Models

Authors

Dombrowski, Mischa, Reynaud, Hadrien, Baugh, Matthew, Kainz, Bernhard

Abstract


Curating datasets for object segmentation is a difficult task. With the advent of large-scale pre-trained generative models, conditional image generation has been given a significant boost in result quality and ease of use. In this paper, we present a novel method that enables the generation of general foreground-background segmentation models from simple textual descriptions, without requiring segmentation labels. We leverage and explore pre-trained latent diffusion models, to automatically generate weak segmentation masks for concepts and objects. The masks are then used to fine-tune the diffusion model on an inpainting task, which enables fine-grained removal of the object, while at the same time providing a synthetic foreground and background dataset. We demonstrate that using this method beats previous methods in both discriminative and generative performance and closes the gap with fully supervised training while requiring no pixel-wise object labels. We show results on the task of segmenting four different objects (humans, dogs, cars, birds) and a use case scenario in medical image analysis. The code is available at https://github.com/MischaD/fobadiffusion.
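The abstract describes a two-stage pipeline: first, weak segmentation masks for a named concept are extracted automatically from a pre-trained latent diffusion model; second, those masks drive inpainting fine-tuning to remove the object and yield paired foreground/background data. A common way to obtain such weak masks is to threshold the cross-attention map associated with the object's text token. The sketch below illustrates only that thresholding step on an already-extracted attention map; the function name, the min-max normalization, and the fixed threshold are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np

def weak_mask_from_attention(attn: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Turn a cross-attention map into a binary weak segmentation mask.

    attn: (H, W) non-negative attention scores for the object's text token,
          e.g. averaged over diffusion timesteps and attention heads.
    Returns a uint8 mask where 1 marks likely-foreground pixels.
    """
    # Min-max normalize so the threshold is comparable across images.
    # (Illustrative choice; the paper may post-process masks differently.)
    a = (attn - attn.min()) / (attn.max() - attn.min() + 1e-8)
    return (a >= threshold).astype(np.uint8)

# Toy example: high attention on the right column marks the "object".
attn = np.array([[0.0, 1.0],
                 [0.2, 0.8]])
mask = weak_mask_from_attention(attn)
```

Such a mask could then be fed, together with the original image, to an inpainting objective so the model learns to synthesize the background with the object removed.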
