Paper Title
Submodular Meta-Learning
Paper Authors
Paper Abstract
In this paper, we introduce a discrete variant of the meta-learning framework. Meta-learning aims at exploiting prior experience and data to improve performance on future tasks. By now, there exist numerous formulations for meta-learning in the continuous domain. Notably, the Model-Agnostic Meta-Learning (MAML) formulation views each task as a continuous optimization problem and, based on prior data, learns a suitable initialization that can be adapted to new, unseen tasks after a few simple gradient updates. Motivated by this formulation, we propose a novel meta-learning framework in the discrete domain, where each task amounts to maximizing a set function under a cardinality constraint. Our approach aims at using prior data, i.e., previously visited tasks, to train a suitable initial solution set that can be quickly adapted to a new task at relatively low computational cost. This approach leads to (i) a personalized solution for each individual task, and (ii) a significantly reduced computational cost at test time compared to the case where the solution is fully optimized once the new task is revealed. The training procedure is performed by solving a challenging discrete optimization problem, for which we present deterministic and randomized algorithms. In the case where the tasks are monotone and submodular, we show strong theoretical guarantees for our proposed methods, even though the training objective may not be submodular. We also demonstrate the effectiveness of our framework on two real-world problem instances, where we observe that our methods lead to a significant reduction in computational complexity when solving new tasks, while incurring only a small performance loss compared to when each task is fully optimized.
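To make the test-time adaptation idea concrete, below is a minimal Python sketch, not the authors' implementation: a meta-trained initial set of size l (learned from previously visited tasks) is completed into a size-k solution for a new task by running the standard greedy algorithm for the remaining k - l picks, so only the completion cost is paid per task. All names here (greedy_extend, f_new, meta_init, coverage_sets) are illustrative assumptions, and the toy objective is a simple coverage function, a classic monotone submodular example.

```python
# Minimal sketch of test-time adaptation in discrete (submodular) meta-learning:
# complete a meta-trained initial set into a size-k solution with greedy steps.
# All names and the toy objective are illustrative assumptions, not the paper's code.

def greedy_extend(f, ground_set, initial_set, k):
    """Greedily grow `initial_set` to size k, at each step adding the element
    with the largest marginal gain f(S | {e}) - f(S)."""
    solution = set(initial_set)
    while len(solution) < k:
        best_elem = max(ground_set - solution,
                        key=lambda e: f(solution | {e}) - f(solution))
        solution.add(best_elem)
    return solution

# Toy monotone submodular objective for a "new task": set coverage.
coverage_sets = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6},
    "d": {1, 6},
}

def f_new(S):
    """Number of universe elements covered by the chosen sets (monotone submodular)."""
    return len(set().union(*(coverage_sets[s] for s in S)))

ground_set = set(coverage_sets)
meta_init = {"a"}  # stand-in for a size-l initial set trained on prior tasks
solution = greedy_extend(f_new, ground_set, meta_init, k=3)
print(sorted(solution), f_new(solution))  # adapted size-k solution and its coverage
```

In this sketch, the new task only pays for k - l greedy rounds instead of k, which mirrors the abstract's claim of reduced test-time computation at the price of a small loss relative to optimizing the task from scratch.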