Paper Title

De-rendering 3D Objects in the Wild

Authors

Felix Wimbauer, Shangzhe Wu, Christian Rupprecht

Abstract

With increasing focus on augmented and virtual reality applications (XR) comes the demand for algorithms that can lift objects from images and videos into representations that are suitable for a wide variety of related 3D tasks. Large-scale deployment of XR devices and applications means that we cannot solely rely on supervised learning, as collecting and annotating data for the unlimited variety of objects in the real world is infeasible. We present a weakly supervised method that is able to decompose a single image of an object into shape (depth and normals), material (albedo, reflectivity and shininess) and global lighting parameters. For training, the method only relies on a rough initial shape estimate of the training objects to bootstrap the learning process. This shape supervision can come for example from a pretrained depth network or - more generically - from a traditional structure-from-motion pipeline. In our experiments, we show that the method can successfully de-render 2D images into a decomposed 3D representation and generalizes to unseen object categories. Since in-the-wild evaluation is difficult due to the lack of ground truth data, we also introduce a photo-realistic synthetic test set that allows for quantitative evaluation.
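
The decomposition described above implies that the predicted factors can be recombined into an image with an analytic shading model. Below is a minimal NumPy sketch of such a recomposition, assuming a Phong-style model with a single dominant light direction plus ambient illumination; the abstract does not spell out the paper's exact shading equations, and all function and parameter names here (recompose_phong, k_ambient, k_diffuse, spec_strength, ...) are illustrative, not taken from the authors' code.

import numpy as np

# Sketch only: recombine per-pixel material/shape predictions with global
# lighting parameters into an RGB image under a Phong-style model.
def recompose_phong(albedo, normals, shininess, spec_strength,
                    light_dir, view_dir, k_ambient, k_diffuse):
    """albedo:        (H, W, 3) base color in [0, 1]
       normals:       (H, W, 3) unit surface normals
       shininess:     scalar Phong exponent ("shininess" in the abstract)
       spec_strength: scalar specular reflectivity
       light_dir:     (3,) unit direction towards the light
       view_dir:      (3,) unit direction towards the camera
       k_ambient, k_diffuse: scalar global lighting intensities"""
    # Diffuse term: Lambertian cosine between normal and light direction.
    n_dot_l = np.clip(normals @ light_dir, 0.0, None)            # (H, W)

    # Specular term: Phong reflection of the light direction about the normal.
    reflect = 2.0 * n_dot_l[..., None] * normals - light_dir     # (H, W, 3)
    r_dot_v = np.clip(reflect @ view_dir, 0.0, None)             # (H, W)
    specular = spec_strength * (r_dot_v ** shininess)            # (H, W)
    specular = np.where(n_dot_l > 0, specular, 0.0)              # no highlight on back-facing pixels

    # Ambient and diffuse light modulate the albedo; the (assumed white)
    # specular highlight is added on top.
    shading = k_ambient + k_diffuse * n_dot_l                     # (H, W)
    image = albedo * shading[..., None] + specular[..., None]
    return np.clip(image, 0.0, 1.0)

In a training setup of this kind, one would presumably render such a recomposition from the network's predictions and compare it against the input photograph, so that the rough initial shape estimate is enough to bootstrap the remaining factors without full supervision.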
