Paper Title
Exploring Bottom-up and Top-down Cues with Attentive Learning for Webly Supervised Object Detection
Paper Authors
Paper Abstract
Fully supervised object detection has achieved great success in recent years. However, abundant bounding-box annotations are needed to train a detector for novel classes. To reduce the human labeling effort, we propose a novel webly supervised object detection (WebSOD) method for novel classes that requires only web images, without further annotations. Our proposed method combines bottom-up and top-down cues for novel-class detection. Within our approach, we introduce a bottom-up mechanism based on a well-trained fully supervised object detector (i.e., Faster R-CNN) as an object region estimator for web images, exploiting the common objectness shared by base and novel classes. With the estimated regions on the web images, we then utilize top-down attention cues as guidance for region classification. Furthermore, we propose a residual feature refinement (RFR) block to tackle the domain mismatch between the web domain and the target domain. We demonstrate our proposed method on the PASCAL VOC dataset with three different novel/base splits. Without any target-domain novel-class images or annotations, our webly supervised object detection model achieves promising performance on novel classes. Moreover, we also conduct transfer learning experiments on the large-scale ILSVRC 2013 detection dataset and achieve state-of-the-art performance.
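The abstract does not spell out the internals of the residual feature refinement (RFR) block. As a rough illustration only, a residual refinement layer typically passes features through a small learned transform and adds the result back to the input, so the transform only has to model the (presumably small) web-to-target domain shift. The sketch below, in NumPy with hypothetical weight shapes not taken from the paper, shows that general pattern:

```python
import numpy as np

def rfr_block(features, w1, b1, w2, b2):
    """Illustrative residual feature refinement (hypothetical, not the
    paper's exact architecture): a two-layer MLP computes a residual
    correction that is added back to the input features via an
    identity shortcut, so an all-zero transform leaves features intact.

    features: (n_regions, d) region feature matrix
    w1, b1:   first layer weights (d, h) and bias (h,)
    w2, b2:   second layer weights (h, d) and bias (d,)
    """
    hidden = np.maximum(features @ w1 + b1, 0.0)  # ReLU non-linearity
    residual = hidden @ w2 + b2                   # learned correction
    return features + residual                    # identity shortcut
```

The identity shortcut is the key design choice: at initialization (near-zero weights) the block is close to a no-op, so refinement can be learned gradually without disturbing the already-useful detector features.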