Interpreting Class Conditional GANs with Channel Awareness

Paper Title

Interpreting Class Conditional GANs with Channel Awareness

Authors

He, Yingqing; Zhang, Zhiyi; Zhu, Jiapeng; Shen, Yujun; Chen, Qifeng

Abstract

Understanding the mechanism of generative adversarial networks (GANs) helps us better use GANs for downstream applications. Existing efforts mainly target interpreting unconditional models, leaving it less explored how a conditional GAN learns to render images regarding various categories. This work fills in this gap by investigating how a class conditional generator unifies the synthesis of multiple classes. For this purpose, we dive into the widely used class-conditional batch normalization (CCBN), and observe that each feature channel is activated at varying degrees given different categorical embeddings. To describe such a phenomenon, we propose channel awareness, which quantitatively characterizes how a single channel contributes to the final synthesis. Extensive evaluations and analyses on the BigGAN model pre-trained on ImageNet reveal that only a subset of channels is primarily responsible for the generation of a particular category, similar categories (e.g., cat and dog) usually get related to some same channels, and some channels turn out to share information across all classes. For good measure, our algorithm enables several novel applications with conditional GANs. Concretely, we achieve (1) versatile image editing via simply altering a single channel and manage to (2) harmoniously hybridize two different classes. We further verify that the proposed channel awareness shows promising potential in (3) segmenting the synthesized image and (4) evaluating the category-wise synthesis performance.
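As a rough illustration of the class-conditional batch normalization (CCBN) mechanism mentioned in the abstract, the sketch below normalizes each channel over a batch and then applies a class-dependent gain and bias. The abstract does not give the formula for channel awareness, so the per-channel "strength" computed here (mean absolute activation after CCBN) is only a stand-in proxy and not the paper's metric. The function names, the parameter table, and all numbers are hypothetical.

```python
import math


def ccbn(features, gain, bias, eps=1e-5):
    """Normalize each channel over the batch, then apply class-specific gain/bias.

    features: list of samples, each a list of C channel values.
    gain, bias: length-C lists chosen according to the class label.
    """
    n = len(features)
    num_channels = len(features[0])
    out = [[0.0] * num_channels for _ in range(n)]
    for c in range(num_channels):
        column = [sample[c] for sample in features]
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / n
        for i, v in enumerate(column):
            out[i][c] = gain[c] * (v - mean) / math.sqrt(var + eps) + bias[c]
    return out


def channel_strength(outputs):
    """Mean absolute activation per channel (an illustrative proxy only)."""
    n = len(outputs)
    return [sum(abs(s[c]) for s in outputs) / n for c in range(len(outputs[0]))]


# Hypothetical per-class affine parameters for three channels.
# Channel 0 is strongly used by both classes, channel 1 barely at all,
# and channel 2 is shared with identical gain.
CLASS_PARAMS = {
    "cat": {"gain": [1.5, 0.1, 0.8], "bias": [0.0, 0.0, 0.0]},
    "dog": {"gain": [1.2, 0.2, 0.8], "bias": [0.0, 0.0, 0.0]},
}

# Made-up batch of four samples with three channels each.
BATCH = [
    [1.0, 2.0, 3.0],
    [2.0, 0.0, 1.0],
    [3.0, 4.0, 5.0],
    [4.0, 2.0, 3.0],
]

strengths = {
    name: channel_strength(ccbn(BATCH, p["gain"], p["bias"]))
    for name, p in CLASS_PARAMS.items()
}

for name, values in strengths.items():
    print(name, [round(v, 3) for v in values])
```

In this toy setting the same features yield different per-channel magnitudes depending on the class parameters, which is the kind of class-dependent channel activation the abstract describes observing in CCBN layers.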
