Paper Title

Grasping the Arrow of Time from the Singularity: Decoding Micromotion in Low-dimensional Latent Spaces from StyleGAN

Paper Authors

Qiucheng Wu, Yifan Jiang, Junru Wu, Kai Wang, Gong Zhang, Humphrey Shi, Zhangyang Wang, Shiyu Chang

Paper Abstract

The disentanglement of StyleGAN latent space has paved the way for realistic and controllable image editing, but does StyleGAN know anything about temporal motion, as it was only trained on static images? To study the motion features in the latent space of StyleGAN, in this paper, we hypothesize and demonstrate that a series of meaningful, natural, and versatile small, local movements (referred to as "micromotion", such as expression, head movement, and aging effect) can be represented in low-rank spaces extracted from the latent space of a conventionally pre-trained StyleGAN-v2 model for face generation, with the guidance of proper "anchors" in the form of either short text or video clips. Starting from one target face image, with the editing direction decoded from the low-rank space, its micromotion features can be represented as simply as an affine transformation over its latent feature. Perhaps more surprisingly, such a micromotion subspace, even when learned from just a single target face, can be painlessly transferred to other unseen face images, even those from vastly different domains (such as oil painting, cartoon, and sculpture faces). This demonstrates that the local feature geometry corresponding to one type of micromotion is aligned across different face subjects, and hence that StyleGAN-v2 is indeed "secretly" aware of the subject-disentangled feature variations caused by that micromotion. We present various successful examples of applying our low-dimensional micromotion subspace technique to directly and effortlessly manipulate faces, showing high robustness, low computational overhead, and impressive domain transferability. Our code is available at https://github.com/wuqiuche/micromotion-StyleGAN.
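
To make the core idea concrete, below is a minimal sketch (not the authors' released code) of how a micromotion direction decoded from a low-rank latent subspace could be applied as an affine transformation in StyleGAN-v2's latent space. It assumes the anchor latent codes (e.g. inverted frames of a video tracing one micromotion) and a target latent have already been obtained via GAN inversion; the function names and the rank-1 SVD extraction are illustrative assumptions, not the paper's exact procedure.

```python
# Minimal sketch: extract a micromotion edit direction from anchor latent
# codes and apply it as an affine transformation to a target latent.
# `anchors` and `w_target` are assumed to come from GAN inversion elsewhere.
import torch

def micromotion_direction(anchors: torch.Tensor) -> torch.Tensor:
    """Extract the dominant edit direction from anchor latent codes.

    anchors: (n, d) latent codes tracing one micromotion (e.g. inverted
    frames of a smile), flattened from the generator's latent space. The
    leading right-singular vector of the mean-centered anchors spans a
    rank-1 approximation of the motion trajectory.
    """
    centered = anchors - anchors.mean(dim=0, keepdim=True)
    # SVD of the centered anchors; the first row of Vh is the top direction.
    _, _, vh = torch.linalg.svd(centered, full_matrices=False)
    return vh[0]  # (d,) unit direction in latent space

def apply_micromotion(w: torch.Tensor, direction: torch.Tensor,
                      alpha: float) -> torch.Tensor:
    """Affine edit: move the target latent `w` along the micromotion axis."""
    return w + alpha * direction

# Hypothetical usage: sweep alpha to animate the micromotion on a new face.
# d = micromotion_direction(anchor_latents)          # (d,)
# edited = [apply_micromotion(w_target, d, a) for a in
#           torch.linspace(0.0, 3.0, 30)]
# Each edited latent would then be fed to the StyleGAN-v2 synthesis network.
```

Because the extracted direction lives in a subject-disentangled subspace, the abstract's transferability claim amounts to reusing the same `direction` on latents of unseen faces, with only `alpha` controlling the magnitude of the micromotion.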
