Online Active Regression

Paper title

Online Active Regression

Paper authors

Cheng Chen, Yi Li, Yiming Sun

Abstract

Active regression considers a linear regression problem in which the learner receives a large number of data points but can observe only a small number of labels. Since online algorithms can handle incremental training data and have low computational cost, we consider an online extension of the active regression problem: the learner receives data points one by one and must immediately decide whether to collect the corresponding label. The goal is to efficiently maintain the regression of the received data points with a small budget of label queries. We propose novel algorithms for this problem under $\ell_p$ loss, where $p\in[1,2]$. To achieve a $(1+\epsilon)$-approximate solution, our proposed algorithms require only $\tilde{\mathcal{O}}(\epsilon^{-1} d \log(n\kappa))$ label queries, where $n$ is the number of data points and $\kappa$ is a quantity, called the condition number, of the data points. The numerical results verify our theoretical results and show that our methods perform comparably to offline active regression algorithms.
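To make the online setting concrete, the following is a minimal sketch for the special case $p=2$. It is not the algorithm from the paper. It assumes a common simplification: each arriving row is queried with probability proportional to its online ridge leverage score, and the queried rows are reweighted and fitted by least squares. The function name `online_active_least_squares`, the callback `query_label`, and the constants `oversample` and `ridge` are hypothetical choices made for illustration only.

```python
import numpy as np


def online_active_least_squares(rows, query_label, oversample=4.0, ridge=1e-6, seed=0):
    """Illustrative sketch (p = 2 only, NOT the paper's algorithm).

    Streams over the rows, queries a label with probability proportional to
    the row's online ridge leverage score, and fits importance-weighted
    least squares on the queried rows. All parameters are assumptions.
    """
    rng = np.random.default_rng(seed)
    d = rows.shape[1]
    # Inverse of (ridge * I + sum of a a^T over the rows seen so far).
    inv_gram = np.eye(d) / ridge
    kept_rows, kept_labels = [], []
    for i, a in enumerate(rows):
        u = inv_gram @ a
        s = float(a @ u)
        # Sherman-Morrison rank-one update so inv_gram now includes row a.
        inv_gram -= np.outer(u, u) / (1.0 + s)
        # Leverage of a with respect to all rows received up to and including a.
        score = s / (1.0 + s)
        prob = min(1.0, oversample * score)
        if rng.random() < prob:
            # The label is requested only for sampled rows; reweight by 1/sqrt(prob).
            w = 1.0 / np.sqrt(prob)
            kept_rows.append(w * a)
            kept_labels.append(w * query_label(i))
    x_hat, *_ = np.linalg.lstsq(np.array(kept_rows), np.array(kept_labels), rcond=None)
    return x_hat, len(kept_labels)


# Usage on synthetic, noiseless data (hypothetical example values).
data_rng = np.random.default_rng(1)
n, d = 2000, 5
A = data_rng.standard_normal((n, d))
x_true = data_rng.standard_normal(d)
b = A @ x_true
x_hat, num_queries = online_active_least_squares(A, lambda i: b[i])
print(f"queried {num_queries} of {n} labels")
```

The data points themselves are free to observe in this setting, so the sketch updates the inverse Gram matrix with every row and pays a label query only for the sampled ones. The query bound quoted in the abstract applies to the authors' algorithms, not to this simplified procedure.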
