Paper title
Provable More Data Hurt in High Dimensional Least Squares Estimator
Paper authors
Paper abstract
This paper investigates the finite-sample prediction risk of the high-dimensional least squares estimator. We derive a central limit theorem for the prediction risk when both the sample size and the number of features tend to infinity. Furthermore, we provide the finite-sample distribution and a confidence interval for the prediction risk. Our theoretical results demonstrate the sample-wise nonmonotonicity of the prediction risk and confirm the "more data hurt" phenomenon.
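The sample-wise nonmonotonicity described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's experiment. It assumes isotropic Gaussian features, a unit-norm coefficient vector, Gaussian noise, and the minimum-norm least squares solution computed with a pseudoinverse. The function name `min_norm_risk` and all parameter values (`p=100`, `sigma=1.0`, `trials=50`) are hypothetical choices made for illustration. Under the isotropic assumption, the excess prediction risk at a fresh test point equals the squared estimation error, which is what the code averages.

```python
import numpy as np

def min_norm_risk(n, p=100, sigma=1.0, trials=50, seed=0):
    """Monte Carlo estimate of the excess prediction risk of the
    minimum-norm least squares estimator with n samples and p features.

    Assumption: features are i.i.d. standard normal, so the excess risk
    at a new test point equals ||beta_hat - beta||^2.
    """
    rng = np.random.default_rng(seed)
    beta = np.ones(p) / np.sqrt(p)  # hypothetical true coefficients, unit norm
    risks = []
    for _ in range(trials):
        X = rng.standard_normal((n, p))
        y = X @ beta + sigma * rng.standard_normal(n)
        # pinv gives ordinary least squares when n > p and the
        # minimum-norm interpolating solution when n < p.
        beta_hat = np.linalg.pinv(X) @ y
        risks.append(np.sum((beta_hat - beta) ** 2))
    return float(np.mean(risks))

# Sample sizes on both sides of the interpolation threshold n = p = 100.
sample_sizes = [25, 50, 90, 200, 400]
risks = {n: min_norm_risk(n) for n in sample_sizes}
for n, r in risks.items():
    print(f"n = {n:4d}  estimated risk = {r:.3f}")
```

In this setup the estimated risk should rise as n approaches p from below and fall again once n is well above p, so adding samples in the range just below p makes prediction worse. The exact values depend on the seed and the assumed noise level.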