Paper Title

ARM3D: Attention-based relation module for indoor 3D object detection

Authors

Yuqing Lan, Yao Duan, Chenyi Liu, Chenyang Zhu, Yueshan Xiong, Hui Huang, Kai Xu

Abstract

Relation context has proven useful for many challenging vision tasks. In 3D object detection, previous methods have used context encoding, graph embedding, or explicit relation reasoning to extract relation context. However, noisy or low-quality proposals inevitably introduce redundant relation context. Invalid relation context usually reflects an underlying misunderstanding or ambiguity in the scene, and it can in turn reduce performance in complex scenes. Inspired by recent attention mechanisms such as the Transformer, we propose a novel 3D attention-based relation module (ARM3D). It combines object-aware relation reasoning, which extracts pair-wise relation contexts among qualified proposals, with an attention module that assigns attention weights to the different relation contexts. In this way, ARM3D can take full advantage of useful relation context and filter out less relevant or even confusing contexts, which mitigates ambiguity in detection. We evaluate the effectiveness of ARM3D by plugging it into several state-of-the-art 3D object detectors and show that it yields more accurate and robust detection results. Extensive experiments demonstrate the capability and generalization of ARM3D on 3D object detection. Our source code is available at https://github.com/lanlan96/ARM3D.
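
The idea of attention-weighted pair-wise relation context can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository above for that). It assumes, purely for illustration, that proposals are "qualified" by an objectness threshold, that a pair-wise relation feature is the difference of two proposal features, and that attention scores are scaled dot products. The names `relation_context`, `softmax`, and `threshold` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def relation_context(features, objectness, threshold=0.5):
    """Attention-pool pair-wise relation features for each qualified proposal.

    features:   list of equal-length feature vectors, one per proposal
    objectness: list of scores used to decide which proposals qualify
                (thresholding is an illustrative assumption)
    Returns a dict mapping each qualified proposal index to its context vector.
    """
    keep = [i for i, s in enumerate(objectness) if s >= threshold]
    contexts = {}
    for i in keep:
        dim = len(features[i])
        others = [j for j in keep if j != i]
        if not others:
            # No partner proposals: fall back to an all-zero context.
            contexts[i] = [0.0] * dim
            continue
        # Pair-wise relation feature: feature difference (assumed form).
        relations = [
            [b - a for a, b in zip(features[i], features[j])] for j in others
        ]
        # Attention score: scaled dot product of the proposal feature with
        # each relation feature (assumed form).
        scale = math.sqrt(dim)
        scores = [
            sum(a * r for a, r in zip(features[i], rel)) / scale
            for rel in relations
        ]
        weights = softmax(scores)
        # Weighted sum: high-weight relations dominate, low-weight ones are
        # effectively filtered out.
        contexts[i] = [
            sum(w * rel[d] for w, rel in zip(weights, relations))
            for d in range(dim)
        ]
    return contexts
```

In a detector, a context vector like this would typically be fused with the proposal's own feature before classification and box regression. How ARM3D performs that fusion is not stated in the abstract.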
