Paper Title
Regularizing Meta-Learning via Gradient Dropout
Paper Authors
Paper Abstract
With the growing attention on learning-to-learn new tasks using only a few examples, meta-learning has been widely applied to numerous problems such as few-shot classification, reinforcement learning, and domain generalization. However, meta-learning models are prone to overfitting when there are insufficient training tasks for the meta-learners to generalize over. Although existing approaches such as Dropout are widely used to address the overfitting problem, these methods are typically designed for regularizing models of a single task in supervised training. In this paper, we introduce a simple yet effective method to alleviate the risk of overfitting for gradient-based meta-learning. Specifically, during the gradient-based adaptation stage, we randomly drop the gradient in the inner-loop optimization of each parameter in deep neural networks, such that the augmented gradients improve generalization to new tasks. We present a general form of the proposed gradient dropout regularization and show that this term can be sampled from either the Bernoulli or Gaussian distribution. To validate the proposed method, we conduct extensive experiments and analysis on numerous computer vision tasks, demonstrating that the gradient dropout regularization mitigates the overfitting problem and improves the performance of various gradient-based meta-learning frameworks.
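To make the described mechanism concrete, the following is a minimal PyTorch-style sketch of applying gradient dropout inside a MAML-like inner-loop adaptation step. The function names (`drop_grad`, `inner_loop_adapt`), the `drop_rate` parameter, and the exact parameters of the Bernoulli and Gaussian masks are illustrative assumptions rather than the paper's precise formulation.

```python
# Sketch: gradient dropout applied to inner-loop gradients in a MAML-style update.
# Assumes a standard PyTorch model; names and noise parameters are illustrative.
import torch

def drop_grad(grad, drop_rate=0.1, dist="bernoulli"):
    """Randomly perturb each gradient element before the inner-loop step."""
    if dist == "bernoulli":
        # Keep each gradient element with probability (1 - drop_rate);
        # this specific masking choice is an assumption for illustration.
        mask = torch.bernoulli(torch.full_like(grad, 1.0 - drop_rate))
    else:
        # Gaussian variant: multiplicative noise centered at 1
        # (the noise scale is an assumption for illustration).
        mask = 1.0 + drop_rate * torch.randn_like(grad)
    return grad * mask

def inner_loop_adapt(model, loss_fn, support_x, support_y, lr=0.01, drop_rate=0.1):
    """One gradient-based adaptation step with gradient dropout applied."""
    params = list(model.parameters())
    loss = loss_fn(model(support_x), support_y)
    # create_graph=True keeps the graph so the outer (meta) loss can backprop
    # through this adaptation step.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    # Apply gradient dropout to each parameter's inner-loop gradient,
    # then take a plain SGD step on the support set.
    adapted = [p - lr * drop_grad(g, drop_rate) for p, g in zip(params, grads)]
    return adapted
```

In practice the returned `adapted` parameters would be used in a functional forward pass on the query set to compute the outer-loop meta-loss; the sketch only shows where the gradient dropout mask is injected.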