Positional Encoding as Spatial Inductive Bias in GANs

Paper Title

Positional Encoding as Spatial Inductive Bias in GANs

Authors

Rui Xu, Xintao Wang, Kai Chen, Bolei Zhou, Chen Change Loy

Abstract

SinGAN shows an impressive capability for learning the internal patch distribution despite its limited effective receptive field. We are interested in knowing how such a translation-invariant convolutional generator can capture global structure with just a spatially i.i.d. input. In this work, taking SinGAN and StyleGAN2 as examples, we show that this capability is, to a large extent, brought by the implicit positional encoding that arises when zero padding is used in the generators. Such positional encoding is indispensable for generating images with high fidelity. The same phenomenon is observed in other generative architectures such as DCGAN and PGGAN. We further show that zero padding leads to an unbalanced spatial bias with a vague relation between locations. To offer a better spatial inductive bias, we investigate alternative positional encodings and analyze their effects. Based on a more flexible explicit positional encoding, we propose a new multi-scale training strategy and demonstrate its effectiveness in the state-of-the-art unconditional generator StyleGAN2. In addition, the explicit spatial inductive bias substantially improves SinGAN for more versatile image manipulation.
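
The claim that zero padding acts as an implicit positional encoding can be illustrated with a toy example. The sketch below is not taken from the paper: it assumes a single 3x3 all-ones kernel and, for clarity, a constant input instead of a spatially i.i.d. one, and the helper name `conv2d_same` is hypothetical. With zero padding, the output near the border differs from the interior, so a position-agnostic filter still receives a cue about where it is; with circular ("wrap") padding that cue disappears.

```python
import numpy as np

def conv2d_same(x, kernel, pad_mode):
    """'Same'-size 2-D cross-correlation; pad_mode is passed to np.pad.

    Hypothetical helper for illustration only (not from the paper).
    """
    kh, kw = kernel.shape
    # "constant" pads with zeros; "wrap" pads circularly.
    padded = np.pad(x, kh // 2, mode=pad_mode)
    h, w = x.shape
    out = np.zeros((h, w), dtype=float)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

# Constant input: every location looks identical to the filter.
x = np.ones((5, 5))
kernel = np.ones((3, 3))

zero_padded = conv2d_same(x, kernel, "constant")
wrap_padded = conv2d_same(x, kernel, "wrap")

print(zero_padded)  # values differ between corners, edges, and interior
print(wrap_padded)  # the same value everywhere
```

In this toy setting only locations within one pixel of the border are distinguishable after a single layer; stacking layers would propagate the border signal inward, which is one way to read the abstract's remark about an unbalanced spatial bias.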
