Paper Title
Transformers for Object Detection in Large Point Clouds
Paper Authors
Paper Abstract
We present TransLPC, a novel detection model for large point clouds that is based on a transformer architecture. While object detection with transformers has been an active field of research, it has proved difficult to apply such models to point clouds that span a large area, e.g. the lidar or radar point clouds common in autonomous driving. TransLPC remedies these issues: the structure of the transformer model is modified to allow for larger input sequence lengths, which are sufficient for large point clouds. In addition, we propose a novel query refinement technique that improves detection accuracy while retaining a memory-friendly number of transformer decoder queries. The queries are repositioned between decoder layers, efficiently moving them closer to the bounding boxes they are estimating. This simple technique has a significant effect on detection accuracy, which we evaluate on real-world lidar data from the challenging nuScenes dataset. Furthermore, the proposed method is compatible with existing transformer-based solutions that require object detection, e.g. for joint multi-object tracking and detection, and enables them to be used with large point clouds.
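To make the query refinement idea from the abstract more concrete, below is a minimal PyTorch sketch of how repositioning decoder queries between layers could look. It is an illustration under assumptions, not the authors' implementation: the module name QueryRefinementDecoder, the center_head offset predictor, the 2D bird's-eye-view reference points, and all hyperparameters are hypothetical.

```python
# Hedged sketch of between-layer query refinement: each object query keeps a
# reference position that is nudged toward the center of the bounding box it
# currently estimates before the next decoder layer runs.
# All names and hyperparameters are hypothetical, not from the TransLPC code.
import torch
import torch.nn as nn


class QueryRefinementDecoder(nn.Module):
    """Transformer decoder whose queries are repositioned between layers."""

    def __init__(self, d_model=256, n_heads=8, n_layers=6, n_queries=300):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
            for _ in range(n_layers)
        )
        # Learnable query content and initial 2D reference positions (BEV coords).
        self.query_embed = nn.Embedding(n_queries, d_model)
        self.ref_points = nn.Embedding(n_queries, 2)
        # Predicts an offset from the current reference point to the center
        # of the box the query is estimating.
        self.center_head = nn.Linear(d_model, 2)
        # Encodes the reference position into the query's positional embedding.
        self.pos_encoder = nn.Sequential(
            nn.Linear(2, d_model), nn.ReLU(), nn.Linear(d_model, d_model)
        )

    def forward(self, memory):
        """memory: encoded point-cloud features, shape (B, N, d_model)."""
        bsz = memory.size(0)
        queries = self.query_embed.weight.unsqueeze(0).expand(bsz, -1, -1)
        ref = self.ref_points.weight.unsqueeze(0).expand(bsz, -1, -1)
        for layer in self.layers:
            # Inject the current reference position into the query content.
            q = queries + self.pos_encoder(ref)
            queries = layer(q, memory)
            # Query refinement: move each reference point toward the currently
            # estimated box center before the next layer attends again.
            ref = ref + self.center_head(queries)
        return queries, ref
```

The design point illustrated here is that the number of queries never grows: instead of adding more queries to cover a large scene, each query's positional information is updated layer by layer so that its attention can concentrate on the region around the box it is predicting.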