CAMO-MOT: Combined Appearance-Motion Optimization for 3D Multi-Object Tracking with Camera-LiDAR Fusion

Paper Title

CAMO-MOT: Combined Appearance-Motion Optimization for 3D Multi-Object Tracking with Camera-LiDAR Fusion

Authors

Wang, Li; Zhang, Xinyu; Qin, Wenyuan; Li, Xiaoyu; Yang, Lei; Li, Zhiwei; Zhu, Lei; Wang, Hong; Li, Jun; Liu, Huaping

Abstract

3D multi-object tracking (MOT) keeps object identities consistent across continuous dynamic detection, which benefits subsequent motion planning and navigation tasks in autonomous driving. However, camera-based methods suffer under occlusion, and LiDAR-based methods struggle to track the irregular motion of objects accurately. Some fusion methods work well but do not account for the unreliability of appearance features under occlusion. False detections also significantly degrade tracking. We therefore propose a novel camera-LiDAR fusion 3D MOT framework based on Combined Appearance-Motion Optimization (CAMO-MOT), which uses both camera and LiDAR data and significantly reduces tracking failures caused by occlusion and false detection. For occlusion, we are the first to propose an occlusion head that repeatedly and effectively selects the best object appearance features, reducing the influence of occlusions. To decrease the impact of false detections on tracking, we design a motion cost matrix based on confidence scores, which improves positioning and object prediction accuracy in 3D space. Because existing multi-object tracking methods consider only a single category, we also propose a multi-category loss to enable multi-object tracking in multi-category scenes. A series of validation experiments is conducted on the KITTI and nuScenes tracking benchmarks. Our method achieves state-of-the-art performance and the lowest identity switches (IDS) value (23 for Car and 137 for Pedestrian) among all multi-modal MOT methods on the KITTI test dataset. It also achieves state-of-the-art performance among all algorithms on the nuScenes test dataset, with 75.3% AMOTA.
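
The abstract does not give the form of the confidence-based motion cost matrix, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes a cost equal to the 3D center distance plus a penalty proportional to (1 - detection confidence), and it swaps in a simple greedy association step. The names `motion_cost_matrix` and `greedy_match` and the parameters `penalty` and `max_cost` are hypothetical.

```python
import math

def motion_cost_matrix(track_centers, detections, penalty=2.0):
    """Build cost[i][j] for track i and detection j.

    Assumed form (not from the paper): Euclidean center distance
    plus penalty * (1 - confidence), so low-confidence detections
    are more expensive to associate.
    """
    costs = []
    for center_t in track_centers:
        row = []
        for center_d, confidence in detections:
            distance = math.dist(center_t, center_d)
            row.append(distance + penalty * (1.0 - confidence))
        costs.append(row)
    return costs

def greedy_match(costs, max_cost=3.0):
    """Greedily pair tracks and detections in order of increasing cost."""
    candidates = sorted(
        (cost, i, j)
        for i, row in enumerate(costs)
        for j, cost in enumerate(row)
    )
    used_tracks, used_dets, matches = set(), set(), []
    for cost, i, j in candidates:
        if cost > max_cost:
            break  # every remaining pair is too expensive
        if i in used_tracks or j in used_dets:
            continue
        matches.append((i, j))
        used_tracks.add(i)
        used_dets.add(j)
    return sorted(matches)

# Predicted 3D centers of two existing tracks.
track_centers = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
# Detections as (center, confidence); the second is a likely false positive.
detections = [
    ((0.5, 0.0, 0.0), 0.9),
    ((0.4, 0.0, 0.0), 0.2),
    ((10.2, 0.0, 0.0), 0.8),
]

costs = motion_cost_matrix(track_centers, detections)
matches = greedy_match(costs)
print(matches)
```

In this toy input the low-confidence detection is slightly closer to the first track than the high-confidence one, yet the confidence penalty makes the high-confidence detection the cheaper match, so the probable false detection is left unassociated.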
