Pose-Guided High-Resolution Appearance Transfer via Progressive Training

Paper Title

Pose-Guided High-Resolution Appearance Transfer via Progressive Training

Authors

Ji Liu, Heshan Liu, Mang-Tik Chiu, Yu-Wing Tai, Chi-Keung Tang

Abstract

We propose a novel pose-guided appearance transfer network for transferring a given reference appearance to a target pose in unprecedented image resolution (1024 * 1024), given respectively an image of the reference and target person. No 3D model is used. Instead, our network utilizes dense local descriptors including local perceptual loss and local discriminators to refine details, which is trained progressively in a coarse-to-fine manner to produce the high-resolution output to faithfully preserve complex appearance of garment textures and geometry, while hallucinating seamlessly the transferred appearances including those with dis-occlusion. Our progressive encoder-decoder architecture can learn the reference appearance inherent in the input image at multiple scales. Extensive experimental results on the Human3.6M dataset, the DeepFashion dataset, and our dataset collected from YouTube show that our model produces high-quality images, which can be further utilized in useful applications such as garment transfer between people and pose-guided human video generation.
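The abstract describes coarse-to-fine progressive training up to 1024 * 1024 and local losses applied around body regions, but gives no concrete schedule or patch size. The following is a minimal sketch under stated assumptions, not the authors' implementation: the starting resolution of 64, the doubling rule, the 64-pixel patch size, and the function names `resolution_schedule`, `scale_keypoint`, and `local_patch_box` are all hypothetical choices made for illustration.

```python
def resolution_schedule(start=64, final=1024):
    """Return the square resolutions visited when doubling from start to final.

    Assumption: resolution doubles at each stage; the abstract only says
    training is progressive and coarse-to-fine.
    """
    if start <= 0 or final < start:
        raise ValueError("need 0 < start <= final")
    sizes = [start]
    while sizes[-1] < final:
        sizes.append(min(sizes[-1] * 2, final))
    return sizes


def scale_keypoint(keypoint, from_size, to_size):
    """Map an (x, y) pose keypoint from one square resolution to another."""
    x, y = keypoint
    return x * to_size // from_size, y * to_size // from_size


def local_patch_box(keypoint, patch, size):
    """Return (left, top, right, bottom) of a patch centred on a keypoint.

    The box is shifted as needed so it stays inside a size x size image.
    A local loss or local discriminator would then look only at this crop.
    """
    if patch > size:
        raise ValueError("patch larger than image")
    x, y = keypoint
    left = min(max(x - patch // 2, 0), size - patch)
    top = min(max(y - patch // 2, 0), size - patch)
    return left, top, left + patch, top + patch


if __name__ == "__main__":
    keypoint_full_res = (512, 256)  # hypothetical keypoint at 1024 x 1024
    for size in resolution_schedule():
        kp = scale_keypoint(keypoint_full_res, 1024, size)
        print(size, kp, local_patch_box(kp, 64, size))
```

Keeping the patch size fixed while the image grows means each local crop covers a progressively smaller body region at higher stages, which is one plausible way to focus later stages on fine garment detail.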
