Paper Title
Few-Shot Learning via Learning the Representation, Provably
Paper Authors
Paper Abstract
This paper studies few-shot learning via representation learning, where one uses $T$ source tasks with $n_1$ data per task to learn a representation in order to reduce the sample complexity of a target task for which there is only $n_2 (\ll n_1)$ data. Specifically, we focus on the setting where there exists a good \emph{common representation} between source and target, and our goal is to understand how much of a sample size reduction is possible. First, we study the setting where this common representation is low-dimensional and provide a fast rate of $O\left(\frac{\mathcal{C}\left(\Phi\right)}{n_1 T} + \frac{k}{n_2}\right)$; here, $\Phi$ is the representation function class, $\mathcal{C}\left(\Phi\right)$ is its complexity measure, and $k$ is the dimension of the representation. When specialized to linear representation functions, this rate becomes $O\left(\frac{dk}{n_1 T} + \frac{k}{n_2}\right)$, where $d (\gg k)$ is the ambient input dimension; this is a substantial improvement over the rate without representation learning, i.e., $O\left(\frac{d}{n_2}\right)$. This result bypasses the $\Omega(\frac{1}{T})$ barrier under the i.i.d. task assumption, and can capture the desired property that all $n_1 T$ samples from source tasks can be \emph{pooled} together for representation learning. Next, we consider the setting where the common representation may be high-dimensional but is capacity-constrained (say, in norm); here, we again demonstrate the advantage of representation learning in both high-dimensional linear regression and neural network learning. Our results demonstrate that representation learning can fully utilize all $n_1 T$ samples from source tasks.
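To make the linear-representation setting concrete, here is a minimal numpy sketch of the two-phase scheme the abstract describes: pool all $n_1 T$ source samples to estimate a shared $d \times k$ linear representation, then fit only a $k$-dimensional head on the $n_2$ target samples. The alternating-least-squares fit, the data-generating process, and all problem sizes below are illustrative assumptions for this sketch, not the estimator analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not from the paper): ambient dimension d,
# representation dimension k << d, T source tasks with n1 samples each,
# and a target task with only n2 << n1 samples.
d, k, T, n1, n2 = 50, 3, 40, 25, 10

# Ground-truth shared linear representation B* (d x k) and per-task heads.
B_star = np.linalg.qr(rng.standard_normal((d, k)))[0]
W_star = rng.standard_normal((k, T))

# Source data: task t has X_t (n1 x d), y_t = X_t B* w*_t + noise.
Xs = [rng.standard_normal((n1, d)) for _ in range(T)]
ys = [X @ B_star @ W_star[:, t] + 0.1 * rng.standard_normal(n1)
      for t, X in enumerate(Xs)]

# Phase 1: learn B by alternating least squares on the pooled n1*T source samples.
B = np.linalg.qr(rng.standard_normal((d, k)))[0]
for _ in range(50):
    # Per-task heads given the current representation.
    W = np.column_stack([np.linalg.lstsq(X @ B, y, rcond=None)[0]
                         for X, y in zip(Xs, ys)])
    # Representation given the heads: vec(X_t B w_t) = (w_t^T kron X_t) vec(B),
    # so stack all tasks and solve one least-squares problem for vec(B).
    A = np.vstack([np.kron(W[:, t], X) for t, X in enumerate(Xs)])  # (n1*T) x (d*k)
    b = np.concatenate(ys)
    B = np.linalg.lstsq(A, b, rcond=None)[0].reshape(d, k, order='F')

# Phase 2: target task with n2 samples; fit only a k-dimensional head on X @ B.
w_target = rng.standard_normal(k)
X_tgt = rng.standard_normal((n2, d))
y_tgt = X_tgt @ B_star @ w_target + 0.1 * rng.standard_normal(n2)
w_hat = np.linalg.lstsq(X_tgt @ B, y_tgt, rcond=None)[0]

# Baseline without representation learning: fit all d parameters from n2 samples.
theta_naive = np.linalg.lstsq(X_tgt, y_tgt, rcond=None)[0]

X_test = rng.standard_normal((1000, d))
y_test = X_test @ B_star @ w_target
print("repr-learning test MSE:", np.mean((X_test @ B @ w_hat - y_test) ** 2))
print("naive d-dim   test MSE:", np.mean((X_test @ theta_naive - y_test) ** 2))
```

With $n_2 < d$, the baseline that regresses directly in $d$ dimensions is underdetermined, while the $k$-dimensional head fitted on the learned features should generalize well in this well-specified synthetic setting, mirroring the $O\left(\frac{dk}{n_1 T} + \frac{k}{n_2}\right)$ versus $O\left(\frac{d}{n_2}\right)$ comparison in the abstract.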