Paper Title
Towards a Neural Graphics Pipeline for Controllable Image Generation
Paper Authors
Paper Abstract
In this paper, we leverage advances in neural networks to form a neural rendering for controllable image generation, thereby bypassing the need for detailed modeling in a conventional graphics pipeline. To this end, we present Neural Graphics Pipeline (NGP), a hybrid generative model that brings together neural and traditional image formation models. NGP decomposes the image into a set of interpretable appearance feature maps, uncovering direct control handles for controllable image generation. To form an image, NGP generates coarse 3D models that are fed into neural rendering modules to produce view-specific interpretable 2D maps, which are then composited into the final output image using a traditional image formation model. Our approach offers control over image generation by providing direct handles for illumination and camera parameters, in addition to control over shape and appearance variations. The key challenge is to learn these controls through unsupervised training that links generated coarse 3D models with unpaired real images via neural and traditional (e.g., Blinn-Phong) rendering functions, without establishing an explicit correspondence between them. We demonstrate the effectiveness of our approach on controllable image generation of single-object scenes. We evaluate our hybrid modeling framework, compare with neural-only generation methods (namely, DCGAN, LSGAN, WGAN-GP, VON, and SRNs), report improvement in FID scores against real images, and demonstrate that NGP supports direct controls common in traditional forward rendering. Code is available at http://geometry.cs.ucl.ac.uk/projects/2021/ngp.
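The abstract's final step composites the network's interpretable 2D maps into an image with a traditional image formation model such as Blinn-Phong. As a rough illustration of that step only, here is a minimal sketch (not the paper's code): it shades hypothetical diffuse albedo, normal, and specular coefficient maps under a single directional light. All function names, map shapes, and the NumPy-based setup are assumptions made for this example.

```python
import numpy as np

def blinn_phong_composite(albedo, normals, specular, shininess,
                          light_dir, view_dir, light_rgb=1.0, ambient=0.1):
    """Composite interpretable 2D maps into an image with Blinn-Phong shading.

    albedo:    (H, W, 3) diffuse albedo map in [0, 1]
    normals:   (H, W, 3) unit normal map
    specular:  (H, W, 1) specular coefficient map
    shininess: scalar Blinn-Phong exponent
    light_dir, view_dir: 3-vectors pointing from the surface toward
                         the light and the camera, respectively
    """
    l = np.asarray(light_dir, float); l /= np.linalg.norm(l)
    v = np.asarray(view_dir, float);  v /= np.linalg.norm(v)
    h = l + v; h /= np.linalg.norm(h)  # half vector between light and view

    # Per-pixel cosine terms, clamped to the front-facing hemisphere.
    n_dot_l = np.clip((normals * l).sum(-1, keepdims=True), 0.0, 1.0)
    n_dot_h = np.clip((normals * h).sum(-1, keepdims=True), 0.0, 1.0)

    diffuse = albedo * n_dot_l                    # Lambertian term
    spec = specular * (n_dot_h ** shininess)      # Blinn-Phong highlight
    return np.clip(ambient * albedo + light_rgb * (diffuse + spec), 0.0, 1.0)

# Illustrative usage with random maps standing in for network outputs.
H = W = 64
albedo = np.random.rand(H, W, 3)
normals = np.random.randn(H, W, 3)
normals /= np.linalg.norm(normals, axis=-1, keepdims=True)
img = blinn_phong_composite(albedo, normals,
                            specular=np.full((H, W, 1), 0.5),
                            shininess=32.0,
                            light_dir=[0.3, 0.5, 1.0],
                            view_dir=[0.0, 0.0, 1.0])
```

Because the shading model is an explicit function of light and camera directions, the direct illumination and camera controls the abstract describes correspond to simply re-evaluating this compositing step with new `light_dir` or `view_dir` values.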