Object-Aware Centroid Voting for Monocular 3D Object Detection

Paper Title

Object-Aware Centroid Voting for Monocular 3D Object Detection

Authors

Wentao Bao, Qi Yu, Yu Kong

Abstract

Monocular 3D object detection aims to detect objects in the 3D physical world from a single camera. However, recent approaches either rely on expensive LiDAR devices or resort to dense pixel-wise depth estimation, which incurs prohibitive computational cost. In this paper, we propose an end-to-end trainable monocular 3D object detector that does not learn dense depth. Specifically, the grid coordinates of a 2D box are first projected back to 3D space with the pinhole model to serve as 3D centroid proposals. Then, a novel object-aware voting approach is introduced, which considers both the region-wise appearance attention and the geometric projection distribution, to vote on the 3D centroid proposals for 3D object localization. With late fusion and the predicted 3D orientation and dimensions, the 3D bounding boxes of objects can be detected from a single RGB image. The method is straightforward yet significantly outperforms other monocular-based methods. Extensive experimental results on the challenging KITTI benchmark validate the effectiveness of the proposed method.
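
The back-projection step mentioned in the abstract can be illustrated with a minimal sketch of the standard pinhole camera model. This is not the authors' implementation: the function name `backproject_grid`, the grid size, and the use of a single given depth for every grid point are assumptions made for illustration only, since the abstract does not say how depth is obtained.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]


def backproject_grid(
    box: Tuple[float, float, float, float],
    depth: float,
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    grid: int = 3,
) -> List[Point3D]:
    """Back-project grid-cell centres of a 2D box into camera coordinates.

    box is (u_min, v_min, u_max, v_max) in pixels. depth is an assumed
    depth along the optical axis, shared by all grid points (hypothetical
    simplification). fx, fy, cx, cy are the pinhole intrinsics.
    """
    u_min, v_min, u_max, v_max = box
    points: List[Point3D] = []
    for i in range(grid):          # rows (image v direction)
        for j in range(grid):      # columns (image u direction)
            # Pixel coordinates of the centre of grid cell (i, j).
            u = u_min + (j + 0.5) * (u_max - u_min) / grid
            v = v_min + (i + 0.5) * (v_max - v_min) / grid
            # Inverse pinhole projection: u = fx * x / z + cx, solved for x.
            x = (u - cx) * depth / fx
            y = (v - cy) * depth / fy
            points.append((x, y, depth))
    return points


# Each returned point is one candidate 3D centroid for the object.
proposals = backproject_grid((0, 0, 6, 6), 10.0, 10.0, 10.0, 3.0, 3.0)
```

In this sketch every grid cell yields one candidate centroid. A voting stage such as the one the abstract describes would then weight these candidates, and that stage is not shown here.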
