Paper title

A spectral least-squares-type method for heavy-tailed corrupted regression with unknown covariance & heterogeneous noise

Paper authors

Oliveira, Roberto I., Rico, Zoraida F., Thompson, Philip

Paper abstract

We revisit heavy-tailed corrupted least-squares linear regression, assuming an $n$-sized label-feature sample corrupted by at most $εn$ arbitrary outliers. We wish to estimate a $p$-dimensional parameter $b^*$ given such a sample of a label-feature pair $(y,x)$ satisfying $y=\langle x,b^*\rangle+ξ$ with heavy-tailed $(x,ξ)$. We only assume that $x$ is $L^4-L^2$ hypercontractive with constant $L>0$ and has covariance matrix $Σ$ with minimum eigenvalue $1/μ^2>0$ and bounded condition number $κ>0$. The noise $ξ$ can be arbitrarily dependent on $x$ and nonsymmetric, as long as $ξx$ has a finite covariance matrix $Ξ$. We propose a near-optimal, computationally tractable estimator based on the power method, assuming no knowledge of $(Σ,Ξ)$ or of the operator norm of $Ξ$. With probability at least $1-δ$, our proposed estimator attains the statistical rate $μ^2\VertΞ\Vert^{1/2}(\frac{p}{n}+\frac{\log(1/δ)}{n}+ε)^{1/2}$ and breakdown point $ε\lesssim\frac{1}{L^4κ^2}$, both optimal in the $\ell_2$-norm, assuming the near-optimal minimum sample size $L^4κ^2(p\log p + \log(1/δ))\lesssim n$, up to a log factor. To the best of our knowledge, this is the first computationally tractable algorithm that simultaneously satisfies all of these properties. Our estimator is based on a two-stage Multiplicative Weight Update algorithm. The first stage estimates a descent direction $\hat v$ with respect to the (unknown) pre-conditioned inner product $\langleΣ(\cdot),\cdot\rangle$. The second stage estimates the descent direction $Σ\hat v$ with respect to the (known) inner product $\langle\cdot,\cdot\rangle$, without knowing or estimating $Σ$.
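The abstract names two generic building blocks, the power method and Multiplicative Weight Update, without giving details. The sketch below illustrates only those textbook primitives and is not the authors' two-stage estimator: it approximates the top eigenvalue of a weighted second-moment matrix by power iteration, then applies one multiplicative reweighting that shrinks the weight of points with a large squared projection onto the leading direction. The function names, the step size `eta`, and the scoring rule are all assumptions made for illustration.

```python
import math
import random


def weighted_second_moment(vectors, weights):
    """Return M = sum_i w_i * z_i z_i^T as a list of rows (hypothetical helper)."""
    p = len(vectors[0])
    M = [[0.0] * p for _ in range(p)]
    for z, w in zip(vectors, weights):
        for a in range(p):
            wa = w * z[a]
            for b in range(p):
                M[a][b] += wa * z[b]
    return M


def power_method(M, iters=200, seed=0):
    """Approximate the top eigenpair of a symmetric PSD matrix by power iteration."""
    rng = random.Random(seed)
    p = len(M)
    v = [rng.gauss(0.0, 1.0) for _ in range(p)]
    norm = math.sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]
    for _ in range(iters):
        u = [sum(M[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(c * c for c in u))
        if norm == 0.0:  # M annihilates v; nothing more to iterate on
            break
        v = [c / norm for c in u]
    Mv = [sum(M[a][b] * v[b] for b in range(p)) for a in range(p)]
    eigenvalue = sum(v[a] * Mv[a] for a in range(p))  # Rayleigh quotient
    return eigenvalue, v


def mwu_step(vectors, weights, v, eta=0.5):
    """One multiplicative update: w_i <- w_i * (1 - eta * score_i / max_score).

    score_i = <z_i, v>^2 is an assumed scoring rule, not taken from the paper.
    Requires 0 < eta < 1 so that every weight stays positive.
    """
    scores = [sum(z[a] * v[a] for a in range(len(v))) ** 2 for z in vectors]
    top = max(scores)
    if top == 0.0:
        return list(weights)
    new = [w * (1.0 - eta * s / top) for w, s in zip(weights, scores)]
    total = sum(new)
    return [w / total for w in new]
```

A typical loop would alternate the two calls: build `weighted_second_moment(vectors, weights)`, extract the leading direction with `power_method`, and pass it to `mwu_step` until the top eigenvalue stops decreasing. The stopping rule and the choice of which vectors to score are where a robust estimator such as the one in the abstract would differ from this sketch.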
