Paper Title

Active Local Learning

Paper Authors

Arturs Backurs, Avrim Blum, Neha Gupta

Paper Abstract

In this work we consider active local learning: given a query point $x$ and active access to an unlabeled training set $S$, output the prediction $h(x)$ of a near-optimal $h \in H$ using significantly fewer labels than would be needed to actually learn $h$ fully. In particular, the number of label queries should be independent of the complexity of $H$, and the function $h$ should be well-defined, independent of $x$. This immediately also implies an algorithm for distance estimation: estimating the value $opt(H)$ from many fewer labels than needed to actually learn a near-optimal $h \in H$, by running local learning on a few random query points and computing the average error. For the hypothesis class consisting of functions supported on the interval $[0,1]$ with Lipschitz constant bounded by $L$, we present an algorithm that makes $O((1/ε^6) \log(1/ε))$ label queries from an unlabeled pool of $O((L/ε^4)\log(1/ε))$ samples. It estimates the distance to the best hypothesis in the class to an additive error of $ε$ for an arbitrary underlying distribution. We further generalize our algorithm to more than one dimension. We emphasize that the number of labels used is independent of the complexity of the hypothesis class, which depends on $L$. Furthermore, we give an algorithm to locally estimate the values of a near-optimal function at a few query points of interest using a number of labels independent of $L$. We also consider the related problem of approximating the minimum error that can be achieved by the Nadaraya-Watson estimator under a linear diagonal transformation with eigenvalues coming from a small range. For a $d$-dimensional point set of size $N$, our algorithm achieves an additive approximation of $ε$, makes $\tilde{O}(d/ε^2)$ queries, and runs in $\tilde{O}(d^2/ε^{d+4} + dN/ε^2)$ time.
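As an illustration of the distance-estimation idea in the abstract (estimate $opt(H)$ by running local learning on a few random query points and averaging the observed error), here is a minimal, self-contained Python sketch under toy assumptions. The helpers `query_label` and `local_prediction` are hypothetical stand-ins invented for this sketch: `local_prediction` simply averages a handful of queried labels near the query point, so it only mimics the interface of a local learner and is not the paper's algorithm, whose label complexity is independent of the Lipschitz bound $L$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: an unlabeled pool S of N points in [0, 1]; labels come from a
# Lipschitz target plus noise and are only revealed through query_label.
N = 20_000
xs = rng.random(N)

def target(x):
    return 0.5 + 0.4 * np.sin(2.0 * x)      # unknown to the learner

labels = target(xs) + rng.normal(0.0, 0.1, N)  # label oracle (each read is costly)
label_queries = 0

def query_label(i):
    """Reveal the label of pool point i, charging it to the label budget."""
    global label_queries
    label_queries += 1
    return labels[i]

def local_prediction(x, k=20):
    """Hypothetical stand-in for a local learner: answer h(x) by averaging the
    queried labels of the k pool points closest to x.  It only mimics the
    interface (a few label queries per query point); it is not the paper's
    algorithm and carries none of its guarantees."""
    nearest = np.argsort(np.abs(xs - x))[:k]
    return float(np.mean([query_label(i) for i in nearest]))

# Distance estimation as sketched in the abstract: run local learning on a few
# random query points and average the error against their queried labels.
m = 100
query_idx = rng.choice(N, size=m, replace=False)
errors = [abs(local_prediction(xs[i]) - query_label(i)) for i in query_idx]
print(f"estimated opt ~ {np.mean(errors):.3f} after {label_queries} label queries")
```

The printed value is a Monte Carlo estimate of the average error of the locally computed predictions; the total label count here is roughly $m \cdot (k+1)$, set by the sampling parameters rather than by the size of the pool.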
