Paper Title
UniT: Unified Knowledge Transfer for Any-shot Object Detection and Segmentation
Paper Authors
Paper Abstract
Methods for object detection and segmentation rely on large scale instance-level annotations for training, which are difficult and time-consuming to collect. Efforts to alleviate this look at varying degrees and quality of supervision. Weakly-supervised approaches draw on image-level labels to build detectors/segmentors, while zero/few-shot methods assume abundant instance-level data for a set of base classes, and none to a few examples for novel classes. This taxonomy has largely siloed algorithmic designs. In this work, we aim to bridge this divide by proposing an intuitive and unified semi-supervised model that is applicable to a range of supervision: from zero to a few instance-level samples per novel class. For base classes, our model learns a mapping from weakly-supervised to fully-supervised detectors/segmentors. By learning and leveraging visual and lingual similarities between the novel and base classes, we transfer those mappings to obtain detectors/segmentors for novel classes, refining them with a few novel class instance-level annotated samples, if available. The overall model is end-to-end trainable and highly flexible. Through extensive experiments on MS-COCO and Pascal VOC benchmark datasets, we show improved performance in a variety of settings.
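The core transfer idea in the abstract — composing a novel class's weak-to-full mapping as a similarity-weighted combination of the base classes' learned mappings — can be sketched as follows. This is a minimal conceptual illustration, not the paper's actual architecture: the array shapes, the `similarity` weights, and the linear form of the mappings are all assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_base, d = 5, 8  # assumed: number of base classes, mapping-parameter dim

# Per-base-class mapping parameters, learned from classes with abundant
# instance-level supervision (weakly-supervised -> fully-supervised).
base_mappings = rng.normal(size=(n_base, d))

# Similarity of a novel class to each base class, e.g. derived from
# visual and lingual (word-embedding) cues; normalized to sum to 1.
similarity = rng.random(n_base)
similarity = similarity / similarity.sum()

# Novel-class mapping = similarity-weighted combination of base mappings.
novel_mapping = similarity @ base_mappings  # shape (d,)
print(novel_mapping.shape)
```

With a few instance-level samples for the novel class available (the "any-shot" setting), this transferred `novel_mapping` would then serve as an initialization to be refined by further training rather than used as-is.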