Title


Weakly supervised training of universal visual concepts for multi-domain semantic segmentation

Authors

Bevandić, Petra, Oršić, Marin, Grubišić, Ivan, Šarić, Josip, Šegvić, Siniša

Abstract


Deep supervised models have an unprecedented capacity to absorb large quantities of training data. Hence, training on multiple datasets becomes a method of choice towards strong generalization in usual scenes and graceful performance degradation in edge cases. Unfortunately, different datasets often have incompatible labels. For instance, the Cityscapes road class subsumes all driving surfaces, while Vistas defines separate classes for road markings, manholes etc. Furthermore, many datasets have overlapping labels. For instance, pickups are labeled as trucks in VIPER, cars in Vistas, and vans in ADE20K. We address this challenge by considering labels as unions of universal visual concepts. This allows seamless and principled learning on multi-domain dataset collections without requiring any relabeling effort. Our method achieves competitive within-dataset and cross-dataset generalization, as well as the ability to learn visual concepts which are not separately labeled in any of the training datasets. Experiments reveal competitive or state-of-the-art performance on two multi-domain dataset collections and on the WildDash 2 benchmark.
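The core idea of "labels as unions of universal visual concepts" can be sketched as follows: the model predicts a softmax over a fine-grained universal taxonomy, and the probability of a coarse dataset-specific label is the sum of the probabilities of the concepts in its union, so the cross-entropy loss needs no relabeling. This is a minimal illustrative sketch, not the authors' implementation; the toy taxonomy, the label-to-concept mapping, and all function names are assumptions made for illustration.

```python
import numpy as np

def universal_log_probs(logits):
    """Numerically stable log-softmax over universal visual concepts."""
    z = logits - logits.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def dataset_class_nll(logits, label_to_concepts, label):
    """NLL of a dataset-specific label treated as a union of universal concepts.

    P(label) = sum over P(concept) for every concept in the label's union,
    so training on a coarse label weakly supervises its fine-grained parts.
    """
    logp = universal_log_probs(logits)
    p_label = np.exp(logp[..., label_to_concepts[label]]).sum(axis=-1)
    return -np.log(p_label)

# Hypothetical universal taxonomy: [road, marking, manhole, car, pickup].
# A Cityscapes-style "road" subsumes all driving surfaces, while a
# Vistas-style "road" excludes markings and manholes (illustrative mapping).
LABEL_TO_CONCEPTS = {
    "cityscapes_road": [0, 1, 2],
    "vistas_road": [0],
}
```

Because a union covers at least as much probability mass as any of its members, the loss for a coarse label is never larger than for a fine one given the same logits, which is what lets incompatible taxonomies be trained jointly.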
