Paper Title


FusionRCNN: LiDAR-Camera Fusion for Two-stage 3D Object Detection

Authors

Xu, Xinli, Dong, Shaocong, Ding, Lihe, Wang, Jie, Xu, Tingfa, Li, Jianan

Abstract


3D object detection with multiple sensors is essential for an accurate and reliable perception system in autonomous driving and robotics. Existing 3D detectors significantly improve accuracy by adopting a two-stage paradigm that relies solely on LiDAR point clouds for 3D proposal refinement. Though impressive, the sparsity of point clouds, especially for distant points, makes it difficult for the LiDAR-only refinement module to accurately recognize and locate objects. To address this problem, we propose a novel multi-modality two-stage approach named FusionRCNN, which effectively and efficiently fuses point clouds and camera images within Regions of Interest (RoI). FusionRCNN adaptively integrates both sparse geometry information from LiDAR and dense texture information from the camera in a unified attention mechanism. Specifically, it first utilizes RoIPooling to obtain an image set of unified size and obtains the point set by sampling raw points within proposals in the RoI extraction step; it then leverages intra-modality self-attention to enhance the domain-specific features, followed by a well-designed cross-attention that fuses information from the two modalities. FusionRCNN is fundamentally plug-and-play and supports different one-stage methods with almost no architectural changes. Extensive experiments on the KITTI and Waymo benchmarks demonstrate that our method significantly boosts the performance of popular detectors. Remarkably, FusionRCNN improves the strong SECOND baseline by 6.14% mAP on Waymo and outperforms competing two-stage approaches. Code will be released soon at https://github.com/xxlbigbrother/Fusion-RCNN.
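The fusion step the abstract describes (intra-modality self-attention on each modality's RoI features, then cross-attention from points to image tokens) can be sketched as follows. This is a minimal, single-head NumPy illustration, not the authors' implementation: the feature dimensions, token counts, and the residual fusion (`p + cross-attention`) are all illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Single-head scaled dot-product attention.
    d = q.shape[-1]
    weights = softmax(q @ k.T / np.sqrt(d))
    return weights @ v

def fuse_roi_features(point_feats, image_feats):
    # Intra-modality self-attention enhances each modality's features.
    p = attention(point_feats, point_feats, point_feats)
    i = attention(image_feats, image_feats, image_feats)
    # Cross-attention: point tokens query the image tokens for texture cues
    # (residual add is an assumption for this sketch).
    return p + attention(p, i, i)

rng = np.random.default_rng(0)
pts = rng.standard_normal((256, 64))  # e.g. 256 points sampled inside one proposal
img = rng.standard_normal((49, 64))   # e.g. a 7x7 RoI-pooled image patch, flattened
out = fuse_roi_features(pts, img)     # fused per-point features, shape (256, 64)
```

A real implementation would use learned query/key/value projections, multiple heads, and layer normalization (e.g. standard transformer blocks); the sketch only shows the information flow between the two modalities inside an RoI.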
