PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models

Paper title

PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models

Authors

Sachit Menon, Alexandru Damian, Shijia Hu, Nikhil Ravi, Cynthia Rudin

Abstract

The primary aim of single-image super-resolution is to construct high-resolution (HR) images from corresponding low-resolution (LR) inputs. In previous approaches, which have generally been supervised, the training objective typically measures a pixel-wise average distance between the super-resolved (SR) and HR images. Optimizing such metrics often leads to blurring, especially in high variance (detailed) regions. We propose an alternative formulation of the super-resolution problem based on creating realistic SR images that downscale correctly. We present an algorithm addressing this problem, PULSE (Photo Upsampling via Latent Space Exploration), which generates high-resolution, realistic images at resolutions previously unseen in the literature. It accomplishes this in an entirely self-supervised fashion and is not confined to a specific degradation operator used during training, unlike previous methods (which require supervised training on databases of LR-HR image pairs). Instead of starting with the LR image and slowly adding detail, PULSE traverses the high-resolution natural image manifold, searching for images that downscale to the original LR image. This is formalized through the "downscaling loss," which guides exploration through the latent space of a generative model. By leveraging properties of high-dimensional Gaussians, we restrict the search space to guarantee realistic outputs. PULSE thereby generates super-resolved images that both are realistic and downscale correctly. We show proof of concept of our approach in the domain of face super-resolution (i.e., face hallucination). We also present a discussion of the limitations and biases of the method as currently implemented with an accompanying model card with relevant metrics. Our method outperforms state-of-the-art methods in perceptual quality at higher resolutions and scale factors than previously possible.
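The search described in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation: the linear `generate` function is a hypothetical stand-in for a pretrained generative model, `downscale` is a simple average-pooling operator on a 1-D signal, and all function names and parameters here are assumptions made for illustration. The sketch reads "leveraging properties of high-dimensional Gaussians" as keeping the latent vector on a sphere of radius sqrt(d), where a d-dimensional standard Gaussian concentrates its mass, and minimizes the downscaling loss by projected gradient descent.

```python
import math
import random


def downscale(signal, factor):
    """Average-pool a 1-D signal by an integer factor (toy degradation operator)."""
    return [sum(signal[i:i + factor]) / factor
            for i in range(0, len(signal), factor)]


def generate(weights, z):
    """Hypothetical linear 'generator': maps latent z to a high-resolution signal."""
    return [sum(w * zj for w, zj in zip(row, z)) for row in weights]


def project_to_sphere(z):
    """Rescale z to norm sqrt(d), where a d-dim standard Gaussian concentrates."""
    radius = math.sqrt(len(z))
    norm = math.sqrt(sum(v * v for v in z))
    return [v * radius / norm for v in z]


def downscaling_loss(weights, z, lr_target, factor):
    """Squared distance between the downscaled candidate and the LR input."""
    lr_pred = downscale(generate(weights, z), factor)
    return sum((p - t) ** 2 for p, t in zip(lr_pred, lr_target))


def search_latent(weights, lr_target, factor, steps=1000, step_size=0.02, seed=0):
    """Projected gradient descent on the sphere, minimizing the downscaling loss."""
    rng = random.Random(seed)
    d = len(weights[0])
    z = project_to_sphere([rng.gauss(0.0, 1.0) for _ in range(d)])
    for _ in range(steps):
        lr_pred = downscale(generate(weights, z), factor)
        residual = [p - t for p, t in zip(lr_pred, lr_target)]
        # Analytic gradient of the loss w.r.t. z for the linear toy generator.
        grad = [0.0] * d
        for i, row in enumerate(weights):
            coeff = 2.0 * residual[i // factor] / factor
            for j in range(d):
                grad[j] += coeff * row[j]
        # Gradient step, then projection back onto the sphere.
        z = project_to_sphere([zj - step_size * g for zj, g in zip(z, grad)])
    return z


if __name__ == "__main__":
    rng = random.Random(1)
    W = [[rng.gauss(0.0, 0.5) for _ in range(4)] for _ in range(8)]
    z_true = project_to_sphere([rng.gauss(0.0, 1.0) for _ in range(4)])
    lr_input = downscale(generate(W, z_true), 2)
    z_found = search_latent(W, lr_input, 2)
    print("downscaling loss:", downscaling_loss(W, z_found, lr_input, 2))
```

With a real generative model the analytic gradient would be replaced by automatic differentiation through the generator, and the degradation operator could be changed without retraining, which is the property the abstract highlights.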
