NeRF-SOS: Any-View Self-supervised Object Segmentation on Complex Scenes

Paper Title

NeRF-SOS: Any-View Self-supervised Object Segmentation on Complex Scenes

Authors

Zhiwen Fan, Peihao Wang, Yifan Jiang, Xinyu Gong, Dejia Xu, Zhangyang Wang

Abstract

Neural volumetric representations have shown that multi-layer perceptrons (MLPs) can be optimized with multi-view calibrated images to represent scene geometry and appearance without explicit 3D supervision. Object segmentation can enrich many downstream applications based on the learned radiance field. However, introducing hand-crafted segmentation to define regions of interest in a complex real-world scene is non-trivial and expensive, as it requires per-view annotation. This paper explores self-supervised learning for object segmentation using NeRF in complex real-world scenes. Our framework, NeRF with Self-supervised Object Segmentation (NeRF-SOS), couples object segmentation with neural radiance fields to segment objects in any view within a scene. By proposing a novel collaborative contrastive loss at both the appearance and geometry levels, NeRF-SOS encourages NeRF models to distill compact geometry-aware segmentation clusters from their density fields and from self-supervised pre-trained 2D visual features. The self-supervised object segmentation framework can be applied to various NeRF models, yielding both photo-realistic rendering results and convincing segmentation maps for indoor and outdoor scenarios. Extensive results on the LLFF, Tanks and Temples, and BlendedMVS datasets validate the effectiveness of NeRF-SOS. It consistently surpasses other 2D-based self-supervised baselines and predicts finer semantic masks than existing supervised counterparts. Please refer to the video on our project page for more details: https://zhiwenfan.github.io/NeRF-SOS.
