DOPS: Learning to Detect 3D Objects and Predict their 3D Shapes

Paper Title

DOPS: Learning to Detect 3D Objects and Predict their 3D Shapes

Authors

Mahyar Najibi, Guangda Lai, Abhijit Kundu, Zhichao Lu, Vivek Rathod, Thomas Funkhouser, Caroline Pantofaru, David Ross, Larry S. Davis, Alireza Fathi

Abstract

We propose DOPS, a fast single-stage 3D object detection method for LIDAR data. Previous methods often make domain-specific design decisions, for example projecting points into a bird's-eye-view image in autonomous driving scenarios. In contrast, we propose a general-purpose method that works on both indoor and outdoor scenes. The core novelty of our method is a fast, single-pass architecture that both detects objects in 3D and estimates their shapes. 3D bounding box parameters are estimated in one pass for every point, aggregated through graph convolutions, and fed into a branch of the network that predicts latent codes representing the shape of each detected object. The latent shape space and shape decoder are learned on a synthetic dataset and then used as supervision for the end-to-end training of the 3D object detection pipeline. Thus, our model is able to extract shapes without access to ground-truth shape information in the target dataset. In experiments, we find that our proposed method achieves state-of-the-art results by ~5% on object detection in ScanNet scenes, and it gets top results by 3.4% in the Waymo Open Dataset, while reproducing the shapes of detected cars.
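
The abstract says per-point box predictions are "aggregated through graph convolutions". The sketch below only illustrates that idea and is not the authors' implementation. It assumes a k-nearest-neighbor graph and replaces the learned graph convolution with a plain mean over neighbors. The function names (knn_graph, aggregate_over_graph) and the toy data are hypothetical.

```python
import math

def knn_graph(points, k):
    """For each point, return the indices of its k nearest points (itself included)."""
    neighbors = []
    for p in points:
        order = sorted(range(len(points)), key=lambda j: math.dist(p, points[j]))
        neighbors.append(order[:k])
    return neighbors

def aggregate_over_graph(per_point_values, neighbors):
    """Average each point's predicted vector with those of its graph neighbors.

    Assumption: a mean over neighbors stands in for a learned graph convolution.
    """
    dim = len(per_point_values[0])
    return [
        [sum(per_point_values[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        for nbrs in neighbors
    ]

# Toy input: two well-separated clusters of points, each point carrying
# a noisy per-point estimate of its object's box center.
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
          (5.0, 5.0, 0.0), (5.1, 5.0, 0.0), (5.0, 5.1, 0.0)]
predicted_centers = [(0.2, 0.0, 0.0), (-0.1, 0.1, 0.0), (0.05, -0.1, 0.0),
                     (5.2, 5.0, 0.0), (4.9, 5.1, 0.0), (5.05, 4.9, 0.0)]

neighbors = knn_graph(points, k=3)
smoothed_centers = aggregate_over_graph(predicted_centers, neighbors)
```

In this toy setup every point's neighborhood is exactly its own cluster, so the noisy per-point estimates within a cluster collapse to a single shared center. A real pipeline would learn the aggregation weights and would predict full box parameters (size, orientation) rather than centers alone.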
