Paper Title

Retire: Robust Expectile Regression in High Dimensions

Authors

Rebeka Man, Kean Ming Tan, Zian Wang, Wen-Xin Zhou

Abstract

High-dimensional data can often display heterogeneity due to heteroscedastic variance or inhomogeneous covariate effects. Penalized quantile and expectile regression methods offer useful tools to detect heteroscedasticity in high-dimensional data. The former is computationally challenging due to the non-smooth nature of the check loss, and the latter is sensitive to heavy-tailed error distributions. In this paper, we propose and study (penalized) robust expectile regression (retire), with a focus on iteratively reweighted $\ell_1$-penalization which reduces the estimation bias from $\ell_1$-penalization and leads to oracle properties. Theoretically, we establish the statistical properties of the retire estimator under two regimes: (i) low-dimensional regime in which $d \ll n$; (ii) high-dimensional regime in which $s\ll n\ll d$ with $s$ denoting the number of significant predictors. In the high-dimensional setting, we carefully characterize the solution path of the iteratively reweighted $\ell_1$-penalized retire estimation, adapted from the local linear approximation algorithm for folded-concave regularization. Under a mild minimum signal strength condition, we show that after as many as $\log(\log d)$ iterations the final iterate enjoys the oracle convergence rate. At each iteration, the weighted $\ell_1$-penalized convex program can be efficiently solved by a semismooth Newton coordinate descent algorithm. Numerical studies demonstrate the competitive performance of the proposed procedure compared with either non-robust or quantile regression based alternatives.
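
The iteratively reweighted $\ell_1$ scheme described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the robust expectile loss is taken to be an asymmetrically weighted Huber loss with robustification parameter `gamma`, the weights come from the derivative of the SCAD penalty (one common folded-concave choice), and each weighted $\ell_1$ subproblem is solved by plain proximal gradient descent rather than the semismooth Newton coordinate descent algorithm the abstract mentions. The function names `retire_grad`, `scad_deriv`, `weighted_l1_fit`, and `irw_retire` are hypothetical, and the intercept is omitted for brevity.

```python
import numpy as np

def retire_grad(u, tau=0.5, gamma=1.345):
    """Derivative in u of the assumed loss |tau - 1(u<0)| * Huber_gamma(u)."""
    w = np.where(u < 0, 1.0 - tau, tau)      # asymmetric expectile weights
    return w * np.clip(u, -gamma, gamma)     # Huber score: identity, then clipped

def scad_deriv(t, lam, a=3.7):
    """Derivative of the SCAD penalty at |t|; supplies the l1 weights."""
    t = np.abs(t)
    return np.where(t <= lam, lam, np.maximum(a * lam - t, 0.0) / (a - 1.0))

def weighted_l1_fit(X, y, weights, beta0, tau, gamma, n_steps=500):
    """Proximal gradient for the weighted l1-penalized convex subproblem."""
    n = X.shape[0]
    # The smooth part has a Lipschitz gradient with constant at most
    # max(tau, 1 - tau) * ||X||_2^2 / n, so 1/L is a safe step size.
    step = n / (max(tau, 1.0 - tau) * np.linalg.norm(X, 2) ** 2)
    beta = beta0.copy()
    for _ in range(n_steps):
        resid = y - X @ beta
        grad = -X.T @ retire_grad(resid, tau, gamma) / n
        z = beta - step * grad
        # Coordinate-wise soft-thresholding with thresholds step * weights
        beta = np.sign(z) * np.maximum(np.abs(z) - step * weights, 0.0)
    return beta

def irw_retire(X, y, lam, tau=0.5, gamma=1.345, n_outer=3):
    """Iteratively reweighted l1: plain l1 first, then SCAD-derived weights."""
    d = X.shape[1]
    beta = np.zeros(d)
    weights = np.full(d, lam)                # first pass is ordinary l1
    for _ in range(n_outer):
        beta = weighted_l1_fit(X, y, weights, beta, tau, gamma)
        weights = scad_deriv(beta, lam)      # large coefficients get weight ~0
    return beta
```

In this sketch, coefficients whose current magnitude exceeds `a * lam` receive zero weight in the next pass, which is how reweighting reduces the shrinkage bias of a plain $\ell_1$ fit. A call such as `irw_retire(X, y, lam=0.1, tau=0.5)` returns a coefficient vector of length `X.shape[1]`; the values of `lam`, `gamma`, and `n_outer` are placeholders that would need tuning in practice.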
