Paper Title
Transfer Learning or Self-supervised Learning? A Tale of Two Pretraining Paradigms
Paper Authors
Paper Abstract
Pretraining has become a standard technique in computer vision and natural language processing, and it usually improves performance substantially. Previously, the dominant pretraining method was transfer learning (TL), which uses labeled source data to learn a good representation network. Recently, a new pretraining approach -- self-supervised learning (SSL) -- has demonstrated promising results on a wide range of applications. SSL requires no annotated labels: it is conducted purely on input data by solving auxiliary (pretext) tasks defined on the input examples. Currently reported results show that SSL outperforms TL in certain applications, while the opposite holds in others. There is no clear understanding of which properties of data and tasks make one approach outperform the other. Without an informed guideline, ML researchers have to try both methods to find out which one works better empirically, which is usually time-consuming. In this work, we aim to address this problem. We perform a comprehensive comparative study of SSL and TL regarding which one works better under different properties of data and tasks, including the domain difference between source and target tasks, the amount of pretraining data, class imbalance in the source data, and the use of target data for additional pretraining. The insights distilled from our comparative study can help ML researchers decide which method to use based on the properties of their applications.
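To make the contrast between the two paradigms concrete, here is a minimal PyTorch sketch, not taken from the paper: the ResNet-18 backbone, the ImageNet weights standing in for labeled source data, the hypothetical 10-class target head, and the rotation-prediction pretext task (Gidaris et al., 2018) are all illustrative assumptions, not the authors' experimental setup.

```python
import torch
import torch.nn as nn
import torchvision.models as models

# --- Transfer learning (TL): pretrain on labeled source data ---
# ImageNet weights stand in for the labeled source task; the 10-class
# head is a hypothetical target task, fine-tuned on target data.
tl_model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
tl_model.fc = nn.Linear(tl_model.fc.in_features, 10)  # replace head for the target task

# --- Self-supervised learning (SSL): pretext task, no labels needed ---
# Rotation prediction is one classic pretext task: rotate each image by
# 0/90/180/270 degrees and train the network to predict the rotation.
ssl_model = models.resnet18(weights=None)
ssl_model.fc = nn.Linear(ssl_model.fc.in_features, 4)  # 4 rotation classes

def rotation_batch(images: torch.Tensor):
    """Turn an unlabeled batch (N, C, H, W) into a pseudo-labeled one."""
    rotated = [torch.rot90(images, k, dims=(2, 3)) for k in range(4)]
    inputs = torch.cat(rotated, dim=0)                          # (4N, C, H, W)
    labels = torch.arange(4).repeat_interleave(images.size(0))  # (4N,)
    return inputs, labels

# One SSL pretraining step on a stand-in unlabeled batch.
images = torch.randn(8, 3, 224, 224)
inputs, labels = rotation_batch(images)
loss = nn.functional.cross_entropy(ssl_model(inputs), labels)
loss.backward()
```

The sketch mirrors the abstract's distinction: the TL branch inherits a representation learned from labeled source data, while the SSL branch generates pseudo-labels from the inputs themselves, so no annotated labels are required.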