Paper title
Ridge Regression with Frequent Directions: Statistical and Optimization Perspectives
Paper authors
Paper abstract
Despite its impressive theoretical \& practical performance, Frequent Directions (\acrshort{fd}) has not been widely adopted for large-scale regression tasks. Prior work has shown that randomized sketches (i) perform worse than \acrshort{fd} in estimating the covariance matrix of the data; (ii) incur high error when estimating the bias and/or variance of sketched ridge regression. We give the first constant-factor relative-error bounds on the bias \& variance for sketched ridge regression using \acrshort{fd}. We complement these statistical results by showing that \acrshort{fd} can be used in the optimization setting through an iterative scheme which yields high-accuracy solutions. This improves on randomized approaches, which must trade off the need for a new sketch at every iteration against speed of convergence. In both settings, we also show that using \emph{Robust Frequent Directions} further enhances performance.
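To make the two settings in the abstract concrete, the following is a minimal NumPy sketch, not the paper's implementation. Everything specific in it is an assumption made for illustration: the function names, the batched variant of Frequent Directions (stack the sketch with the next m rows, take an SVD, shrink), the preconditioned update used for the iterative scheme, and the use of half the accumulated shrinkage as the extra regularization that stands in for Robust Frequent Directions.

```python
import numpy as np


def frequent_directions(A, m):
    """Batched Frequent Directions (illustrative variant).

    Returns an m x d sketch B and the total shrinkage `shrink`, chosen so that
    0 <= A^T A - B^T B <= shrink * I in the positive semidefinite order.
    """
    n, d = A.shape
    B = np.zeros((m, d))
    shrink = 0.0
    for start in range(0, n, m):
        # Stack the current sketch with the next batch of at most m rows.
        stacked = np.vstack([B, A[start:start + m]])
        _, s, Vt = np.linalg.svd(stacked, full_matrices=False)
        # Shrink by the (m+1)-th squared singular value, if there is one.
        delta = s[m] ** 2 if len(s) > m else 0.0
        shrink += delta
        k = min(m, len(s))
        kept = np.sqrt(np.maximum(s[:k] ** 2 - delta, 0.0))
        B = np.zeros((m, d))
        B[:k] = kept[:, None] * Vt[:k]
    return B, shrink


def iterative_sketched_ridge(A, y, B, gamma, iters=20, shift=0.0):
    """Iterate x <- x - H^{-1} grad, with H = B^T B + (gamma + shift) I.

    grad is the exact ridge gradient A^T (A x - y) + gamma x, so any fixed
    point is the exact ridge solution. With iters=1 and x0 = 0 this returns
    the one-shot sketched estimate H^{-1} A^T y.
    Setting shift = shrink / 2 is the assumed Robust FD correction.
    """
    d = A.shape[1]
    H = B.T @ B + (gamma + shift) * np.eye(d)
    x = np.zeros(d)
    for _ in range(iters):
        grad = A.T @ (A @ x - y) + gamma * x
        x = x - np.linalg.solve(H, grad)
    return x
```

In this sketch the same matrix B is reused at every iteration, which is the contrast with randomized schemes drawn in the abstract. A sufficient condition for the plain update (shift = 0) to contract is that `shrink` is smaller than `gamma`; with `shift = shrink / 2` the update contracts for any positive `gamma`, although possibly slowly when `gamma` is small.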