Paper Title

Learn Faster and Forget Slower via Fast and Stable Task Adaptation

Paper Authors

Farshid Varno, Lucas May Petry, Lisa Di Jorio, Stan Matwin

Paper Abstract

Training Deep Neural Networks (DNNs) is still highly time-consuming and compute-intensive. It has been shown that adapting a pretrained model may significantly accelerate this process. With a focus on classification, we show that current fine-tuning techniques make pretrained models catastrophically forget the transferred knowledge even before anything about the new task is learned. Such rapid knowledge loss undermines the merits of transfer learning and may result in a much slower convergence rate than when the maximum amount of knowledge is exploited. We investigate the source of this problem from different perspectives and, to alleviate it, introduce Fast And Stable Task-adaptation (FAST), an easy-to-apply fine-tuning algorithm. The paper provides a novel geometric perspective on how the loss landscapes of the source and target tasks are linked under different transfer learning strategies. We empirically show that, compared to prevailing fine-tuning practices, FAST learns the target task faster and forgets the source task more slowly.
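To make the contrast concrete, below is a minimal sketch (assuming PyTorch) of the kind of prevailing fine-tuning practice the abstract argues against: a randomly initialized head is attached to a pretrained backbone and all parameters are updated jointly from the first step. The backbone, head, dummy data, and drift measurement are illustrative assumptions, not part of the paper, and the sketch does not implement the FAST algorithm itself, whose details are not given in the abstract.

```python
# Sketch of conventional fine-tuning: pretrained backbone + new random head,
# all parameters trained together from step one. Illustrative only; not the
# paper's FAST algorithm.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for a pretrained feature extractor (in practice, e.g. a torchvision model).
backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 64), nn.ReLU())
head = nn.Linear(64, 10)  # new target-task head, randomly initialized

model = nn.Sequential(backbone, head)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# Snapshot the "transferred knowledge" so drift of the backbone can be measured.
init_backbone = {k: v.detach().clone() for k, v in backbone.state_dict().items()}

x = torch.randn(128, 32)           # dummy target-task inputs
y = torch.randint(0, 10, (128,))   # dummy target-task labels

for step in range(20):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()

# How far the backbone has moved from its pretrained weights after only a few
# steps; large early drift, driven by gradients from the untrained head, is the
# kind of rapid knowledge loss the abstract describes.
drift = sum((backbone.state_dict()[k] - v).norm().item() for k, v in init_backbone.items())
print(f"final loss: {loss.item():.3f}, backbone drift (L2): {drift:.3f}")
```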
