Title
Many-shot from Low-shot: Learning to Annotate using Mixed Supervision for Object Detection
Authors
Abstract
Object detection has witnessed significant progress by relying on large, manually annotated datasets. Annotating such datasets is highly time-consuming and expensive, which motivates the development of weakly supervised and few-shot object detection methods. However, these methods largely underperform their strongly supervised counterparts, as weak training signals \emph{often} result in partial or oversized detections. Towards solving this problem, we introduce, for the first time, an online annotation module (OAM) that learns to generate a many-shot set of \emph{reliable} annotations from a larger volume of weakly labelled images. Our OAM can be jointly trained with any fully supervised two-stage object detection method, providing additional training annotations on the fly. This results in a fully end-to-end strategy that requires only a low-shot set of fully annotated images. The integration of the OAM with Fast(er) R-CNN improves their performance by $17\%$ mAP and $9\%$ AP50 on the PASCAL VOC 2007 and MS-COCO benchmarks, and significantly outperforms competing methods using mixed supervision.
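The joint-training idea described in the abstract can be sketched as a training loop in which each step combines the low-shot fully annotated images with OAM-generated annotations that pass a reliability check. This is a minimal illustrative sketch only: `oam_annotate`, its confidence threshold, and the data structures are hypothetical stand-ins, not the authors' implementation.

```python
import random


def train_jointly(fully_labelled, weakly_labelled, epochs=3, seed=0):
    """Sketch of mixed-supervision training.

    fully_labelled: list of (image, box) pairs with strong annotations.
    weakly_labelled: list of images with only image-level labels.
    Returns the number of training annotations used per epoch.
    """
    rng = random.Random(seed)

    def oam_annotate(image):
        # Hypothetical stand-in for the OAM: propose a bounding box
        # together with a confidence score for the weak image.
        box = [rng.random() for _ in range(4)]
        score = rng.random()
        return box, score

    history = []
    for _ in range(epochs):
        # The low-shot strong labels are always part of the batch.
        batch = list(fully_labelled)
        for image in weakly_labelled:
            box, score = oam_annotate(image)
            if score > 0.5:  # keep only annotations deemed reliable
                batch.append((image, box))
        # A real system would run one Fast(er) R-CNN update on `batch`
        # here, so the OAM supplies extra annotations on the fly.
        history.append(len(batch))
    return history
```

Because the OAM and the detector share the training loop, unreliable pseudo-annotations can simply be dropped for that step rather than polluting the detector's supervision.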