Paper Title

Energy-Based Contrastive Learning of Visual Representations

Authors

Beomsu Kim, Jong Chul Ye

Abstract

Contrastive learning is a method of learning visual representations by training Deep Neural Networks (DNNs) to increase the similarity between representations of positive pairs (transformations of the same image) and reduce the similarity between representations of negative pairs (transformations of different images). Here we explore Energy-Based Contrastive Learning (EBCLR), which leverages the power of generative learning by combining contrastive learning with Energy-Based Models (EBMs). EBCLR can be theoretically interpreted as learning the joint distribution of positive pairs, and it shows promising results on small and medium-scale datasets such as MNIST, Fashion-MNIST, CIFAR-10, and CIFAR-100. Specifically, we find that EBCLR achieves a 4× to 20× acceleration over SimCLR and MoCo v2 in terms of training epochs. Furthermore, in contrast to SimCLR, we observe that EBCLR achieves nearly the same performance with 254 negative pairs (batch size 128) and with 30 negative pairs (batch size 16) per positive pair, demonstrating the robustness of EBCLR to small numbers of negative pairs. Hence, EBCLR provides a novel avenue for improving contrastive learning methods, which usually require large datasets with a significant number of negative pairs per iteration to achieve reasonable performance on downstream tasks. Code: https://github.com/1202kbs/EBCLR
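
To make the "pull positive pairs together, push negative pairs apart" idea concrete, below is a minimal sketch of a standard InfoNCE/NT-Xent-style contrastive loss in PyTorch. This is only an illustration of the contrastive term described above, not the EBCLR objective itself (which additionally incorporates an energy-based generative component); the function name, temperature value, and usage pattern are illustrative assumptions rather than details from the paper.

# Minimal InfoNCE/NT-Xent-style contrastive loss sketch (illustrative only,
# NOT the EBCLR objective). Assumes z1, z2 are two augmented views of the
# same batch of images passed through the same encoder.
import torch
import torch.nn.functional as F

def info_nce_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    """z1, z2: (N, D) representations of two views of the same N images."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, D), unit-norm rows
    sim = z @ z.t() / temperature                         # (2N, 2N) scaled cosine similarities
    n = z1.shape[0]
    # Exclude self-similarity so each row sees 2N - 2 negatives and 1 positive.
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float('-inf'))
    # The positive for view i is the other view of the same image.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

# Usage sketch: z1 = encoder(aug1(x)); z2 = encoder(aug2(x)); loss = info_nce_loss(z1, z2)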
