Paper Title

Scaling Novel Object Detection with Weakly Supervised Detection Transformers

Paper Authors

Tyler LaBonte, Yale Song, Xin Wang, Vibhav Vineet, Neel Joshi

Paper Abstract

A critical object detection task is finetuning an existing model to detect novel objects, but the standard workflow requires bounding box annotations which are time-consuming and expensive to collect. Weakly supervised object detection (WSOD) offers an appealing alternative, where object detectors can be trained using image-level labels. However, the practical application of current WSOD models is limited, as they only operate at small data scales and require multiple rounds of training and refinement. To address this, we propose the Weakly Supervised Detection Transformer, which enables efficient knowledge transfer from a large-scale pretraining dataset to WSOD finetuning on hundreds of novel objects. Additionally, we leverage pretrained knowledge to improve the multiple instance learning (MIL) framework often used in WSOD methods. Our experiments show that our approach outperforms previous state-of-the-art models on large-scale novel object detection datasets, and our scaling study reveals that class quantity is more important than image quantity for WSOD pretraining. The code is available at https://github.com/tmlabonte/weakly-supervised-DETR.
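The multiple instance learning (MIL) framework mentioned in the abstract is the standard way WSOD methods train from image-level labels: each image is treated as a bag of region proposals, and per-proposal scores are pooled into image-level class scores that can be supervised directly. The sketch below illustrates the common WSDDN-style two-branch MIL head, under our own assumptions; it is not the paper's exact formulation, and all names (`mil_image_scores`, `w_cls`, `w_det`) are illustrative.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mil_image_scores(features, w_cls, w_det):
    """WSDDN-style MIL head (a common WSOD baseline, not this paper's
    exact method). Two linear branches score each proposal:
      - classification branch: softmax over classes (what is it?)
      - detection branch: softmax over proposals (where is it?)
    Their element-wise product, summed over proposals, gives
    image-level class scores that image-level labels can supervise.
    """
    cls_logits = features @ w_cls              # (num_proposals, num_classes)
    det_logits = features @ w_det              # (num_proposals, num_classes)
    cls_scores = softmax(cls_logits, axis=1)   # per-proposal class distribution
    det_scores = softmax(det_logits, axis=0)   # per-class proposal distribution
    proposal_scores = cls_scores * det_scores  # joint proposal-class evidence
    return proposal_scores.sum(axis=0)         # image-level scores, each in [0, 1]

# Toy example: 5 proposals with 4-dim features, 3 classes.
rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 4))
scores = mil_image_scores(feats, rng.normal(size=(4, 3)), rng.normal(size=(4, 3)))
```

Because the detection branch's softmax over proposals sums to 1 per class, each pooled image-level score lands in [0, 1] and can be trained with a binary cross-entropy loss against the image-level labels; the proposal-class products then serve as pseudo-detections.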
