Paper Title
Global Convergence and Generalization Bound of Gradient-Based Meta-Learning with Deep Neural Nets
Paper Authors
Paper Abstract
Gradient-based meta-learning (GBML) with deep neural nets (DNNs) has become a popular approach for few-shot learning. However, due to the non-convexity of DNNs and the bi-level optimization in GBML, the theoretical properties of GBML with DNNs remain largely unknown. In this paper, we first aim to answer the following question: Does GBML with DNNs have global convergence guarantees? We provide a positive answer to this question by proving that GBML with over-parameterized DNNs is guaranteed to converge to global optima at a linear rate. The second question we aim to address is: How does GBML achieve fast adaptation to new tasks with prior experience on past tasks? To answer it, we theoretically show that GBML is equivalent to a functional gradient descent operation that explicitly propagates experience from past tasks to new ones, and then we prove a generalization error bound of GBML with over-parameterized DNNs.
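For context on the bi-level optimization the abstract refers to: GBML methods such as MAML adapt a shared initialization to each task with an inner gradient step and update that initialization with an outer step. A standard formulation of this objective is sketched below for orientation only; the notation (θ for the meta-learned initialization, α for the inner-loop step size, and the support/query split of each task's loss) is ours and is not taken from this paper.

\min_{\theta} \; \frac{1}{N} \sum_{i=1}^{N} \mathcal{L}_i^{\mathrm{query}}\!\left(\theta - \alpha \, \nabla_{\theta} \mathcal{L}_i^{\mathrm{support}}(\theta)\right)

Here \mathcal{L}_i^{\mathrm{support}} and \mathcal{L}_i^{\mathrm{query}} denote the loss of task i evaluated on its support and query data, respectively; the outer minimization over θ is the non-convex, bi-level problem whose global convergence and generalization the paper analyzes in the over-parameterized regime.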