Paper Title

Open-Vocabulary One-Stage Detection with Hierarchical Visual-Language Knowledge Distillation

Paper Authors

Zongyang Ma, Guan Luo, Jin Gao, Liang Li, Yuxin Chen, Shaoru Wang, Congxuan Zhang, Weiming Hu

Paper Abstract

Open-vocabulary object detection aims to detect novel object categories beyond the training set. The advanced open-vocabulary two-stage detectors employ instance-level visual-to-visual knowledge distillation to align the visual space of the detector with the semantic space of the Pre-trained Visual-Language Model (PVLM). However, in the more efficient one-stage detector, the absence of class-agnostic object proposals hinders the knowledge distillation on unseen objects, leading to severe performance degradation. In this paper, we propose a hierarchical visual-language knowledge distillation method, i.e., HierKD, for open-vocabulary one-stage detection. Specifically, a global-level knowledge distillation is explored to transfer the knowledge of unseen categories from the PVLM to the detector. Moreover, we combine the proposed global-level knowledge distillation and the common instance-level knowledge distillation to learn the knowledge of seen and unseen categories simultaneously. Extensive experiments on MS-COCO show that our method significantly surpasses the previous best one-stage detector with 11.9\% and 6.7\% $AP_{50}$ gains under the zero-shot detection and generalized zero-shot detection settings, and reduces the $AP_{50}$ performance gap from 14\% to 7.3\% compared to the best two-stage detector.
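
To make the hierarchical idea concrete, below is a minimal sketch of how a combined global-level and instance-level distillation loss could look, assuming a CLIP-like PVLM producing 512-dimensional embeddings. The function names, loss choices (L1 and cosine terms), and weights `w_inst`/`w_glob` are illustrative assumptions, not the paper's actual HierKD implementation.

```python
import torch
import torch.nn.functional as F

def instance_level_kd(det_region_feats, pvlm_region_feats):
    # Align detector region embeddings with PVLM embeddings of the same
    # regions; an L1 term on normalized features is one common choice.
    return F.l1_loss(F.normalize(det_region_feats, dim=-1),
                     F.normalize(pvlm_region_feats, dim=-1))

def global_level_kd(det_global_feat, pvlm_global_feat):
    # Align a global image-level embedding from the detector with the
    # PVLM's global embedding, so unseen-category knowledge can be
    # transferred without relying on class-agnostic proposals.
    return 1.0 - F.cosine_similarity(det_global_feat, pvlm_global_feat, dim=-1).mean()

def hierarchical_kd_loss(det_region_feats, pvlm_region_feats,
                         det_global_feat, pvlm_global_feat,
                         w_inst=1.0, w_glob=1.0):
    # Hypothetical weighting of the two levels; the paper's actual losses
    # and weights may differ.
    return (w_inst * instance_level_kd(det_region_feats, pvlm_region_feats)
            + w_glob * global_level_kd(det_global_feat, pvlm_global_feat))

# Toy usage with random embeddings (d = 512, as in CLIP ViT-B/32).
det_regions, pvlm_regions = torch.randn(8, 512), torch.randn(8, 512)
det_global, pvlm_global = torch.randn(2, 512), torch.randn(2, 512)
loss = hierarchical_kd_loss(det_regions, pvlm_regions, det_global, pvlm_global)
```

The key design point the abstract emphasizes is the global-level term: because a one-stage detector lacks class-agnostic proposals, instance-level distillation alone cannot cover unseen categories, so an image-level alignment with the PVLM is added on top of it.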
