Maximum Spatial Perturbation Consistency for Unpaired Image-to-Image Translation

Paper Title

Maximum Spatial Perturbation Consistency for Unpaired Image-to-Image Translation

Authors

Yanwu Xu, Shaoan Xie, Wenhao Wu, Kun Zhang, Mingming Gong, Kayhan Batmanghelich

Abstract

Unpaired image-to-image translation (I2I) is an ill-posed problem, as an infinite number of translation functions can map the source domain distribution to the target distribution. Therefore, much effort has been put into designing suitable constraints, e.g., cycle consistency (CycleGAN), geometry consistency (GCGAN), and contrastive learning-based constraints (CUTGAN), that help better pose the problem. However, these well-known constraints have limitations: (1) they are either too restrictive or too weak for specific I2I tasks; (2) these methods result in content distortion when there is a significant spatial variation between the source and target domains. This paper proposes a universal regularization technique called maximum spatial perturbation consistency (MSPC), which enforces a spatial perturbation function (T) and the translation operator (G) to be commutative (i.e., TG = GT). In addition, we introduce two adversarial training components for learning the spatial perturbation function. The first one lets T compete with G to achieve maximum perturbation. The second one lets G and T compete with discriminators to align the spatial variations caused by the change of object size, object distortion, background interruptions, etc. Our method outperforms the state-of-the-art methods on most I2I benchmarks. We also introduce a new benchmark, namely the front face to profile face dataset, to emphasize the underlying challenges of I2I for real-world applications. We finally perform ablation experiments to study the sensitivity of our method to the severity of spatial perturbation and its effectiveness for distribution alignment.
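The commutativity constraint TG = GT in the abstract can be made concrete with a toy sketch. Everything below is an assumption for illustration: the abstract does not specify the loss, so a mean absolute difference between T(G(x)) and G(T(x)) is used here, and the functions `flip_horizontal`, `brighten`, and `column_shade` are hypothetical fixed stand-ins for the learned networks T and G described in the paper.

```python
# Illustrative sketch only: fixed toy functions stand in for the learned T and G.

def flip_horizontal(image):
    """Toy spatial perturbation T: mirror each row of a 2-D list of pixels."""
    return [list(reversed(row)) for row in image]

def brighten(image):
    """Toy translator G acting per pixel; it commutes with any spatial rearrangement."""
    return [[2.0 * p + 1.0 for p in row] for row in image]

def column_shade(image):
    """Toy translator G whose output depends on pixel position; it does not commute with a flip."""
    return [[p + col for col, p in enumerate(row)] for row in image]

def consistency_loss(G, T, image):
    """Mean absolute difference between T(G(x)) and G(T(x)) (assumed form of the penalty)."""
    a = T(G(image))
    b = G(T(image))
    diffs = [abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)]
    return sum(diffs) / len(diffs)

x = [[0.0, 1.0, 2.0],
     [3.0, 4.0, 5.0]]

# Zero when G and T commute on x, positive when they do not.
loss_commuting = consistency_loss(brighten, flip_horizontal, x)
loss_non_commuting = consistency_loss(column_shade, flip_horizontal, x)
```

In this toy setting the per-pixel translator incurs no penalty, while the position-dependent one does. Per the abstract, the actual method learns T adversarially so that the perturbation is as large as possible; that training loop is not shown here.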
