Delving into Inter-Image Invariance for Unsupervised Visual Representations

Paper Title

Delving into Inter-Image Invariance for Unsupervised Visual Representations

Authors

Xie, Jiahao; Zhan, Xiaohang; Liu, Ziwei; Ong, Yew Soon; Loy, Chen Change

Abstract

Contrastive learning has recently shown immense potential in unsupervised visual representation learning. Existing studies in this track mainly focus on intra-image invariance learning, which typically uses rich intra-image transformations to construct positive pairs and then maximizes agreement using a contrastive loss. The merits of inter-image invariance, by contrast, remain much less explored. One major obstacle to exploiting inter-image invariance is that it is unclear how to reliably construct inter-image positive pairs, and how to derive effective supervision from them, since no pair annotations are available. In this work, we present a comprehensive empirical study to better understand the role of inter-image invariance learning from three main constituent components: pseudo-label maintenance, sampling strategy, and decision boundary design. To facilitate the study, we introduce a unified and generic framework that supports the integration of unsupervised intra- and inter-image invariance learning. Through carefully designed comparisons and analysis, multiple valuable observations are revealed: 1) online labels converge faster and perform better than offline labels; 2) semi-hard negative samples are more reliable and unbiased than hard negative samples; 3) a less stringent decision boundary is more favorable for inter-image invariance learning. With all the obtained recipes, our final model, namely InterCLR, shows consistent improvements over state-of-the-art intra-image invariance learning methods on multiple standard benchmarks. We hope this work will provide useful experience for devising effective unsupervised inter-image invariance learning. Code: https://github.com/open-mmlab/mmselfsup.
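To make the idea of inter-image positive pairs concrete, the following is a minimal NumPy sketch. It is not the InterCLR implementation from the paper or from MMSelfSup. It assumes an InfoNCE-style contrastive loss, a temperature of 0.2, random embeddings, and hand-written pseudo-labels. The helper names `l2_normalize`, `info_nce`, and `inter_image_pairs` are hypothetical and chosen only for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)


def info_nce(query, positive, negatives, temperature=0.2):
    """InfoNCE-style loss for one query (assumed form, not taken from the paper).

    query: (d,), positive: (d,), negatives: (n, d); all assumed unit-normalized.
    """
    pos_logit = np.dot(query, positive) / temperature
    neg_logits = negatives @ query / temperature
    logits = np.concatenate(([pos_logit], neg_logits))
    # Numerically stable log-sum-exp over the positive and all negatives.
    shift = logits.max()
    log_denominator = shift + np.log(np.exp(logits - shift).sum())
    return float(log_denominator - pos_logit)


def inter_image_pairs(index, pseudo_labels):
    """Hypothetical helper: other images sharing the pseudo-label are positives,
    images with a different pseudo-label are negatives."""
    others = np.arange(len(pseudo_labels)) != index
    same = pseudo_labels == pseudo_labels[index]
    return np.flatnonzero(same & others), np.flatnonzero(~same)


# Toy data: 8 images with 16-dimensional embeddings and made-up pseudo-labels.
rng = np.random.default_rng(0)
embeddings = l2_normalize(rng.normal(size=(8, 16)))
pseudo_labels = np.array([0, 0, 1, 1, 2, 2, 0, 1])

pos_idx, neg_idx = inter_image_pairs(0, pseudo_labels)
loss = info_nce(embeddings[0], embeddings[pos_idx[0]], embeddings[neg_idx])
print(f"inter-image contrastive loss: {loss:.4f}")
```

In an intra-image setup the positive would instead be a second augmented view of the same image. Here the positive comes from a different image that shares a pseudo-label, so the quality of the supervision depends on how reliable those pseudo-labels are and on which negatives are sampled.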
