Paper title
Exploring Gradient-based Multi-directional Controls in GANs
Paper authors
Abstract
Generative Adversarial Networks (GANs) have been widely applied in modeling diverse image distributions. However, despite their impressive applications, the structure of the latent space in GANs largely remains a black box, leaving controllable generation an open problem, especially when spurious correlations between different semantic attributes exist in the image distributions. To address this problem, previous methods typically learn linear directions or individual channels that control semantic attributes in the image space. However, they often suffer from imperfect disentanglement or are unable to obtain multi-directional controls. In this work, in light of the above challenges, we propose a novel approach that discovers nonlinear controls based on gradient information in the learned GAN latent space, enabling multi-directional manipulation as well as effective disentanglement. More specifically, we first learn interpolation directions by following the gradients from classification networks trained separately on the attributes, and then navigate the latent space by exclusively controlling channels activated for the target attribute in the learned directions. Empirically, with small training data, our approach gains fine-grained control over a diverse set of bi-directional and multi-directional attributes, and we showcase its ability to achieve disentanglement significantly better than state-of-the-art methods, both qualitatively and quantitatively.
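The two steps named in the abstract (follow classifier gradients, then move only along channels tied to the target attribute) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes linear attribute classifiers acting directly on a 4-dimensional latent code, and a hypothetical masking rule that keeps a channel only when the target attribute's gradient magnitude exceeds that of every other attribute. The names `logit_jacobian`, `exclusive_mask`, and `edit_latent` are invented for illustration. With linear classifiers the gradient is constant, so the path here is a straight line; a nonlinear classifier would give a gradient that changes along the path.

```python
import numpy as np

def logits(w, W, b):
    """Toy attribute classifiers: one linear logit per attribute (assumption)."""
    return W @ w + b

def logit_jacobian(w, W, b):
    """Gradient of each logit w.r.t. the latent code.
    For this linear toy it is W everywhere; a real classifier would need autodiff."""
    return W

def exclusive_mask(jac, target):
    """Hypothetical rule: keep channels where the target attribute's gradient
    magnitude is strictly larger than that of every other attribute.
    Requires at least two attributes."""
    mag = np.abs(jac)
    others = np.delete(mag, target, axis=0)
    return (mag[target] > others.max(axis=0)).astype(jac.dtype)

def edit_latent(w, W, b, target, step=0.1, n_steps=10):
    """Follow the target attribute's gradient, restricted to its exclusive channels."""
    w = w.copy()
    for _ in range(n_steps):
        jac = logit_jacobian(w, W, b)
        w = w + step * exclusive_mask(jac, target) * jac[target]
    return w

# Attribute 1 shares channel 0 with attribute 0 (a spurious correlation),
# but channel 2 is specific to attribute 1.
W = np.array([[1.0, 0.2, 0.0, 0.0],
              [0.8, 0.0, 1.0, 0.0]])
b = np.zeros(2)
w0 = np.zeros(4)
w1 = edit_latent(w0, W, b, target=1)
print(logits(w0, W, b), logits(w1, W, b))
```

In this toy setup only channel 2 moves, so the logit for attribute 1 rises while the logit for attribute 0 stays where it was. An unmasked gradient step would also move channel 0 and shift both attributes.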