Paper Title
MLCVNet: Multi-Level Context VoteNet for 3D Object Detection
Paper Authors
Abstract
In this paper, we address the 3D object detection task by capturing multi-level contextual information with the self-attention mechanism and multi-scale feature fusion. Most existing 3D object detection methods recognize objects individually, without considering the contextual information between them. In contrast, we propose Multi-Level Context VoteNet (MLCVNet), which builds on the state-of-the-art VoteNet and recognizes 3D objects correlatively. We introduce three context modules into the voting and classifying stages of VoteNet to encode contextual information at different levels. Specifically, a Patch-to-Patch Context (PPC) module captures contextual information between point patches before they vote for their corresponding object centroid points. Subsequently, an Object-to-Object Context (OOC) module is incorporated before the proposal and classification stage to capture contextual information between object candidates. Finally, a Global Scene Context (GSC) module is designed to learn the global scene context. By capturing contextual information at the patch, object, and scene levels, our method improves detection accuracy and achieves new state-of-the-art detection performance on two challenging 3D object detection datasets, SUN RGB-D and ScanNet. We also release our code at https://github.com/NUAAXQ/MLCVNet.
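To make the idea of self-attention over patch features concrete, the following is a minimal sketch of generic scaled dot-product self-attention with a residual connection, written in plain Python. It is not taken from the released MLCVNet code: the function names (`matmul`, `softmax`, `self_attention_context`), the projection matrices, and the tiny feature sizes are all illustrative assumptions, and the actual PPC and OOC modules may be structured differently.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over a single list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_context(features, w_q, w_k, w_v):
    """Hypothetical context step: each row (a patch or an object candidate)
    attends to every row and adds the attended features back to itself."""
    q = matmul(features, w_q)
    k = matmul(features, w_k)
    v = matmul(features, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)               # pairwise similarities, n x n
    weights = [softmax([s / scale for s in row]) for row in scores]
    attended = matmul(weights, v)
    # Residual connection: original features plus context from all rows.
    contextual = [
        [f + a for f, a in zip(f_row, a_row)]
        for f_row, a_row in zip(features, attended)
    ]
    return contextual, weights


# Toy input: 4 "patches" with 3-dimensional features (sizes are assumptions).
random.seed(0)
num_items, dim = 4, 3
features = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_items)]
w_q = [[random.gauss(0, 0.1) for _ in range(dim)] for _ in range(dim)]
w_k = [[random.gauss(0, 0.1) for _ in range(dim)] for _ in range(dim)]
w_v = [[random.gauss(0, 0.1) for _ in range(dim)] for _ in range(dim)]

contextual, weights = self_attention_context(features, w_q, w_k, w_v)
```

Each row of `weights` sums to one, so every patch receives a weighted average of all patches' value features. The same routine could in principle be applied to object-candidate features to illustrate object-level context.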