DeepInteraction: 3D Object Detection via Modality Interaction

Paper Title

DeepInteraction: 3D Object Detection via Modality Interaction

Authors

Zeyu Yang, Jiaqi Chen, Zhenwei Miao, Wei Li, Xiatian Zhu, Li Zhang

Abstract

Existing top-performing 3D object detectors typically rely on a multi-modal fusion strategy. This design, however, is fundamentally limited: it overlooks useful modality-specific information, which ultimately hampers model performance. To address this limitation, we introduce a novel modality interaction strategy in which individual per-modality representations are learned and maintained throughout, so that their unique characteristics can be exploited during object detection. To realize this strategy, we design the DeepInteraction architecture, characterized by a multi-modal representational interaction encoder and a multi-modal predictive interaction decoder. Experiments on the large-scale nuScenes dataset show that the proposed method surpasses all prior art, often by a large margin. Crucially, our method ranks first on the highly competitive nuScenes object detection leaderboard.
