Paper Title
Provable Stochastic Optimization for Global Contrastive Learning: Small Batch Does Not Harm Performance
Paper Authors
Paper Abstract
In this paper, we study contrastive learning from an optimization perspective, aiming to analyze and address a fundamental issue of existing contrastive learning methods that rely on either a large batch size or a large dictionary of feature vectors. We consider a global objective for contrastive learning, which contrasts each positive pair with all negative pairs for an anchor point. From the optimization perspective, we explain why existing methods such as SimCLR require a large batch size in order to achieve a satisfactory result. To remove such a requirement, we propose a memory-efficient Stochastic Optimization algorithm for solving the Global objective of Contrastive Learning of Representations, named SogCLR. We show that its optimization error is negligible under a reasonable condition after a sufficient number of iterations, or is diminishing for a slightly different global contrastive objective. Empirically, we demonstrate that SogCLR with a small batch size (e.g., 256) can achieve performance similar to that of SimCLR with a large batch size (e.g., 8192) on the self-supervised learning task on ImageNet-1K. We also attempt to show that the proposed optimization technique is generic and can be applied to solving other contrastive losses, e.g., two-way contrastive losses for bimodal contrastive learning. The proposed method is implemented in our open-source library LibAUC (www.libauc.org).
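To make the notion of a global contrastive objective concrete, the following is an illustrative formulation rather than a formula quoted from the paper; the symbols $h_i$ (anchor representation), $h_i^{+}$ (its positive), $\mathcal{N}_i$ (the set of all negatives for anchor $i$, drawn from the whole dataset rather than the current mini-batch), $\mathrm{sim}(\cdot,\cdot)$, and the temperature $\tau$ are assumed notation for exposition:

\[
\mathcal{L}_{\mathrm{global}} \;=\; -\frac{1}{n}\sum_{i=1}^{n} \log \frac{\exp\!\big(\mathrm{sim}(h_i, h_i^{+})/\tau\big)}{\sum_{j \in \mathcal{N}_i} \exp\!\big(\mathrm{sim}(h_i, h_j)/\tau\big)}
\]

Because the denominator ranges over all negatives in the data, a plain mini-batch implementation replaces it with a partial sum inside the logarithm, which biases the gradient unless the batch is large; this is the optimization-level issue the abstract attributes to SimCLR-style methods and the one a stochastic algorithm for the global objective is designed to avoid.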