Paper Title
iFAN: Image-Instance Full Alignment Networks for Adaptive Object Detection
Paper Authors
Paper Abstract
Training an object detector on a data-rich domain and applying it to a data-poor one with limited performance drop is highly attractive in industry, because it saves huge annotation costs. Recent research on unsupervised domain adaptive object detection has verified that aligning data distributions between source and target images through adversarial learning is very useful. The key is when, where, and how to use it to achieve best practice. We propose Image-Instance Full Alignment Networks (iFAN) to tackle this problem by precisely aligning feature distributions at both the image and instance levels: 1) Image-level alignment: multi-scale features are roughly aligned by training adversarial domain classifiers in a hierarchically nested fashion. 2) Full instance-level alignment: deep semantic information and elaborate instance representations are fully exploited to establish a strong relationship among categories and domains. Establishing these correlations is formulated as a metric learning problem by carefully constructing instance pairs. The above adaptations can be integrated into an object detector (e.g., Faster RCNN), resulting in an end-to-end trainable framework in which multiple alignments work collaboratively in a coarse-to-fine manner. On two domain adaptation tasks, synthetic-to-real (SIM10K->Cityscapes) and normal-to-foggy weather (Cityscapes->Foggy Cityscapes), iFAN outperforms the state-of-the-art methods with a boost of 10%+ AP over the source-only baseline.
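The abstract states that instance-level alignment is cast as a metric learning problem over carefully constructed instance pairs, but it does not give the loss itself. As a rough illustration only, the sketch below uses a generic contrastive loss over cross-domain pairs: same-category pairs are pulled together, different-category pairs are pushed at least a margin apart. The function name `contrastive_pair_loss`, the `margin` parameter, and the choice of Euclidean distance are assumptions made for this example and are not taken from the paper.

```python
import math
from itertools import product


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_pair_loss(source, target, margin=1.0):
    """Generic contrastive loss over all source-target instance pairs.

    source, target: lists of (feature_vector, category_label) tuples.
    This is a hypothetical stand-in for the paper's metric learning
    objective, which the abstract does not specify.
    """
    losses = []
    for (fs, cs), (ft, ct) in product(source, target):
        d = euclidean(fs, ft)
        if cs == ct:
            # Same category across domains: penalize any distance.
            losses.append(d ** 2)
        else:
            # Different categories: penalize only if closer than the margin.
            losses.append(max(0.0, margin - d) ** 2)
    return sum(losses) / len(losses) if losses else 0.0


if __name__ == "__main__":
    src = [([0.0, 0.0], "car"), ([2.0, 0.0], "person")]
    tgt = [([0.1, 0.0], "car"), ([2.0, 0.2], "person")]
    print(contrastive_pair_loss(src, tgt))
```

In a real detector the feature vectors would come from pooled region features and the loss would be minimized with an autodiff framework; here plain Python lists keep the pair construction visible.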