Paper Title
COLA: COarse LAbel pre-training for 3D semantic segmentation of sparse LiDAR datasets
Paper Authors
Paper Abstract
Transfer learning is a proven technique in 2D computer vision to leverage the large amount of data available and achieve high performance with datasets limited in size due to the cost of acquisition or annotation. In 3D, annotation is known to be a costly task; nevertheless, pre-training methods have only recently been investigated. Due to this cost, unsupervised pre-training has been heavily favored. In this work, we tackle the case of real-time 3D semantic segmentation of sparse autonomous driving LiDAR scans. Such datasets have been increasingly released, but each has a unique label set. We propose here an intermediate-level label set called coarse labels, which can easily be used on any existing or future autonomous driving dataset, thus allowing all the data available to be leveraged at once without any additional manual labeling. This way, we have access to a larger dataset, alongside a simple task of semantic segmentation. With it, we introduce a new pre-training task: coarse label pre-training, also called COLA. We thoroughly analyze the impact of COLA on various datasets and architectures and show that it yields a noticeable performance improvement, especially when only a small dataset is available for the fine-tuning task.
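To make the coarse-label idea concrete, below is a minimal sketch of how one dataset's fine labels could be remapped to a shared coarse label set before pre-training. The coarse categories, fine class names, and the mapping itself are illustrative assumptions, not the taxonomy defined in the paper.

```python
import numpy as np

# Hypothetical shared coarse label set (assumption, not the paper's exact taxonomy).
COARSE = {"vehicle": 0, "pedestrian": 1, "road": 2, "structure": 3, "vegetation": 4, "other": 5}

# Example fine-to-coarse mapping for one dataset's label set (assumed class names).
FINE_TO_COARSE = {
    "car": "vehicle", "truck": "vehicle", "bicycle": "vehicle",
    "person": "pedestrian",
    "road": "road", "sidewalk": "road",
    "building": "structure", "fence": "structure",
    "vegetation": "vegetation", "terrain": "vegetation",
}

def remap_labels(fine_labels: np.ndarray, fine_names: dict) -> np.ndarray:
    """Map a per-point array of fine label ids to shared coarse label ids."""
    # Build a lookup table: fine id -> coarse id, defaulting to "other".
    lut = np.full(max(fine_names) + 1, COARSE["other"], dtype=np.int64)
    for fine_id, name in fine_names.items():
        lut[fine_id] = COARSE[FINE_TO_COARSE.get(name, "other")]
    return lut[fine_labels]

# Usage: per-point fine labels from one scan, with this dataset's id -> name table.
fine_names = {0: "car", 1: "person", 2: "road", 3: "building", 4: "vegetation"}
scan_labels = np.array([0, 2, 2, 3, 4, 1])
print(remap_labels(scan_labels, fine_names))  # -> [0 2 2 3 4 1] in coarse ids
```

A remapping of this kind would let heterogeneous datasets be pooled into a single coarse-label corpus for the pre-training stage, with fine-grained labels used only during fine-tuning.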