Paper title
Pixel-level Correspondence for Self-Supervised Learning from Video
Paper authors
Abstract
While self-supervised learning has enabled effective representation learning in the absence of labels, for vision, video remains a relatively untapped source of supervision. To address this, we propose Pixel-level Correspondence (PiCo), a method for dense contrastive learning from video. By tracking points with optical flow, we obtain a correspondence map which can be used to match local features at different points in time. We validate PiCo on standard benchmarks, outperforming self-supervised baselines on multiple dense prediction tasks, without compromising performance on image classification.
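The abstract's central mechanism, turning optical flow into a correspondence map and using it to match local features across time, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the nearest-pixel rounding of flow targets, the InfoNCE-style loss, and the temperature value are all assumptions chosen for illustration, and the abstract does not specify the exact loss PiCo uses.

```python
import numpy as np

def correspondence_from_flow(flow):
    """Map each pixel in frame A to a pixel in frame B (hypothetical helper).

    flow: (H, W, 2) array of (dx, dy) displacements from frame A to frame B.
    Returns integer target columns tx, target rows ty, and a mask marking
    pixels whose target lands inside the frame.
    """
    h, w, _ = flow.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    tx = np.rint(xs + flow[..., 0]).astype(int)
    ty = np.rint(ys + flow[..., 1]).astype(int)
    valid = (tx >= 0) & (tx < w) & (ty >= 0) & (ty < h)
    return tx, ty, valid

def dense_contrastive_loss(feat_a, feat_b, tx, ty, valid, temperature=0.1):
    """InfoNCE-style loss over local features (an assumed formulation).

    feat_a, feat_b: (H, W, C) feature maps for the two frames. For every
    valid location in feat_a, the positive is the feature at (ty, tx) in
    feat_b; all other locations in feat_b act as negatives.
    """
    h, w, c = feat_a.shape
    eps = 1e-8
    a = feat_a / (np.linalg.norm(feat_a, axis=-1, keepdims=True) + eps)
    b = feat_b / (np.linalg.norm(feat_b, axis=-1, keepdims=True) + eps)
    a = a.reshape(-1, c)[valid.ravel()]      # (N_valid, C)
    b = b.reshape(-1, c)                     # (H*W, C)
    logits = a @ b.T / temperature           # (N_valid, H*W)
    targets = (ty * w + tx)[valid]           # flat index of each positive
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(targets)), targets].mean()

# Toy usage: frame B is frame A shifted one pixel to the right.
rng = np.random.default_rng(0)
feat_a = rng.normal(size=(4, 4, 16))
feat_b = np.roll(feat_a, shift=1, axis=1)
flow = np.zeros((4, 4, 2))
flow[..., 0] = 1.0                           # dx = +1 everywhere
tx, ty, valid = correspondence_from_flow(flow)
loss = dense_contrastive_loss(feat_a, feat_b, tx, ty, valid)
```

In this toy setup the flow exactly describes the shift between the two feature maps, so the loss under the flow-derived correspondence should be much lower than under an identity (zero-flow) correspondence. Pixels whose flow target leaves the frame are simply masked out here; a real pipeline would likely also filter unreliable flow, which this sketch does not attempt.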