Paper title
Robust Meta-learning for Mixed Linear Regression with Small Batches
Paper authors
Paper abstract
A common challenge in practical supervised learning, arising in areas such as medical image processing and robotic interactions, is that there are plenty of tasks but each task cannot afford to collect enough labeled examples to be learned in isolation. However, by exploiting the similarities across those tasks, one can hope to overcome such data scarcity. Under a canonical scenario where each task is drawn from a mixture of $k$ linear regressions, we study a fundamental question: can abundant small-data tasks compensate for the lack of big-data tasks? Existing second-moment-based approaches show that such a trade-off is efficiently achievable, with the help of medium-sized tasks with $\Omega(k^{1/2})$ examples each. However, this algorithm is brittle in two important scenarios: the predictions can be arbitrarily bad (i) even with only a few outliers in the dataset, or (ii) even if the medium-sized tasks are slightly smaller, with $o(k^{1/2})$ examples each. We introduce a spectral approach that is simultaneously robust under both scenarios. To this end, we first design a novel outlier-robust principal component analysis algorithm that achieves optimal accuracy. This is followed by a sum-of-squares algorithm to exploit the information from higher-order moments. Together, this approach is robust against outliers and achieves a graceful statistical trade-off; the lack of $\Omega(k^{1/2})$-size tasks can be compensated for with smaller tasks, which can now be as small as $O(\log k)$.
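As a concrete illustration of the second-moment idea the abstract refers to (a hypothetical sketch, not the paper's actual algorithm), consider a uniform mixture of $k$ linear regressions with standard Gaussian covariates. For two independent examples $(x_1, y_1), (x_2, y_2)$ drawn from the same task, the cross-moment $\mathbb{E}[y_1 y_2\, x_1 x_2^\top]$ equals $\frac{1}{k}\sum_j \beta_j \beta_j^\top$, so the top-$k$ eigenvectors of its empirical estimate recover the subspace spanned by the regression vectors, even when every batch has only two examples. All dimensions and constants below are chosen purely for illustration:

```python
import numpy as np

# Hypothetical sketch of the classical second-moment approach: recover the
# subspace spanned by k regression vectors from many tiny (size-2) tasks.
rng = np.random.default_rng(0)
d, k, n_tasks, noise = 20, 3, 100_000, 0.1

betas = rng.standard_normal((k, d))   # ground-truth regression vectors
z = rng.integers(k, size=n_tasks)     # each task's mixture component
B = betas[z]                          # (n_tasks, d): per-task parameter

# Each task is a "small batch" of just two labeled examples (x, y).
X1 = rng.standard_normal((n_tasks, d))
X2 = rng.standard_normal((n_tasks, d))
y1 = (X1 * B).sum(axis=1) + noise * rng.standard_normal(n_tasks)
y2 = (X2 * B).sum(axis=1) + noise * rng.standard_normal(n_tasks)

# Empirical cross-moment, symmetrized: M ~ (1/k) * sum_j beta_j beta_j^T.
M = (y1[:, None] * X1).T @ (y2[:, None] * X2) / n_tasks
M = (M + M.T) / 2

# Top-k eigenvectors give an orthonormal basis U for the estimated subspace.
U = np.linalg.eigh(M)[1][:, -k:]

# Each true beta_j should lie almost entirely inside span(U).
ratios = [np.linalg.norm(U @ (U.T @ b)) / np.linalg.norm(b) for b in betas]
print(ratios)  # each ratio close to 1
```

As the abstract notes, this moment estimator is exactly what breaks down under adversarial outliers or when batches are too small to estimate higher moments, which motivates the paper's outlier-robust PCA and sum-of-squares machinery.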