Pixel-level Correspondence for Self-Supervised Learning from Video

Paper title

Pixel-level Correspondence for Self-Supervised Learning from Video

Authors

Yash Sharma, Yi Zhu, Chris Russell, Thomas Brox

Abstract

While self-supervised learning has enabled effective representation learning in the absence of labels, for vision, video remains a relatively untapped source of supervision. To address this, we propose Pixel-level Correspondence (PiCo), a method for dense contrastive learning from video. By tracking points with optical flow, we obtain a correspondence map which can be used to match local features at different points in time. We validate PiCo on standard benchmarks, outperforming self-supervised baselines on multiple dense prediction tasks, without compromising performance on image classification.
