Paper Title


Towards Sample-efficient Overparameterized Meta-learning

Authors

Yue Sun, Adhyyan Narang, Halil Ibrahim Gulluk, Samet Oymak, Maryam Fazel

Abstract


An overarching goal in machine learning is to build a generalizable model with few samples. To this end, overparameterization has been the subject of immense interest to explain the generalization ability of deep nets even when the size of the dataset is smaller than that of the model. While the prior literature focuses on the classical supervised setting, this paper aims to demystify overparameterization for meta-learning. Here we have a sequence of linear-regression tasks and we ask: (1) Given earlier tasks, what is the optimal linear representation of features for a new downstream task? and (2) How many samples do we need to build this representation? This work shows that surprisingly, overparameterization arises as a natural answer to these fundamental meta-learning questions. Specifically, for (1), we first show that learning the optimal representation coincides with the problem of designing a task-aware regularization to promote inductive bias. We leverage this inductive bias to explain how the downstream task actually benefits from overparameterization, in contrast to prior works on few-shot learning. For (2), we develop a theory to explain how feature covariance can implicitly help reduce the sample complexity well below the degrees of freedom and lead to small estimation error. We then integrate these findings to obtain an overall performance guarantee for our meta-learning algorithm. Numerical experiments on real and synthetic data verify our insights on overparameterized meta-learning.
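To make the setting in the abstract concrete, the sketch below simulates a sequence of linear-regression tasks whose parameters share a low-dimensional subspace, learns a linear representation from those tasks, and then solves a few-shot downstream task in that representation. This is a minimal illustrative sketch, not the paper's algorithm: the dimensions, noise level, and the simple moment-based subspace estimate are assumptions chosen only to make the example self-contained and runnable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, chosen only for illustration.
d, r = 50, 5          # ambient feature dimension, rank of the shared subspace
T, n_task = 200, 40   # number of meta-training tasks, samples per task
n_few = 15            # few-shot sample budget for the downstream task

# Shared structure: every task parameter lies in a rank-r subspace spanned by U.
U, _ = np.linalg.qr(rng.standard_normal((d, r)))

def sample_task():
    beta = U @ rng.standard_normal(r)                 # task parameter in the shared subspace
    X = rng.standard_normal((n_task, d))
    y = X @ beta + 0.1 * rng.standard_normal(n_task)  # noisy linear-regression task
    return X, y

# Meta-training: average outer products of crude per-task estimates,
# then take the top-r eigenspace as the learned linear representation.
M = np.zeros((d, d))
for _ in range(T):
    X, y = sample_task()
    b_hat = np.linalg.lstsq(X, y, rcond=None)[0]      # min-norm per-task estimate
    M += np.outer(b_hat, b_hat) / T
_, eigvecs = np.linalg.eigh(M)
U_hat = eigvecs[:, -r:]                               # learned representation (top-r eigenvectors)

# Downstream few-shot task: regress in the learned representation
# versus fitting all d raw features from only n_few samples.
beta_new = U @ rng.standard_normal(r)
X_new = rng.standard_normal((n_few, d))
y_new = X_new @ beta_new + 0.1 * rng.standard_normal(n_few)

w_rep = np.linalg.lstsq(X_new @ U_hat, y_new, rcond=None)[0]
beta_rep = U_hat @ w_rep                                   # estimate using the representation
beta_raw = np.linalg.lstsq(X_new, y_new, rcond=None)[0]    # estimate ignoring the representation

print("error with learned representation:", np.linalg.norm(beta_rep - beta_new))
print("error with raw few-shot fit:      ", np.linalg.norm(beta_raw - beta_new))
```

Run as-is, the representation-based estimate typically has much smaller parameter error than the raw few-shot fit, illustrating why a well-chosen representation (here standing in for the task-aware inductive bias discussed in the abstract) reduces the effective sample complexity of the downstream task.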
