Paper Title
Sketched Newton-Raphson
Paper Authors
Paper Abstract
We propose a new globally convergent stochastic second-order method. Our starting point is the development of a new Sketched Newton-Raphson (SNR) method for solving large-scale nonlinear equations of the form $F(x)=0$ with $F:\mathbb{R}^p \rightarrow \mathbb{R}^m$. We then show how to design several stochastic second-order optimization methods by rewriting the optimization problem of interest as a system of nonlinear equations and applying SNR. For instance, by applying SNR to find a stationary point of a generalized linear model (GLM), we derive completely new and scalable stochastic second-order methods. We show that the resulting method is very competitive compared to state-of-the-art variance-reduced methods. Furthermore, using a variable splitting trick, we also show that the Stochastic Newton method (SNM) is a special case of SNR, and use this connection to establish the first global convergence theory of SNM. We establish the global convergence of SNR by showing that it is a variant of the stochastic gradient descent (SGD) method, and then leveraging proof techniques of SGD. As a special case, our theory also provides a new global convergence theory for the original Newton-Raphson method under strictly weaker assumptions than the classic monotone convergence theory.
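To make the idea concrete, below is a minimal, illustrative sketch of one SNR-style iteration in Python. It assumes the sketched update $x_{k+1} = x_k - \gamma\, DF(x_k)^\top S_k \big(S_k^\top DF(x_k) DF(x_k)^\top S_k\big)^{\dagger} S_k^\top F(x_k)$ with Gaussian sketching matrices $S_k$; the function names, sketch distribution, step size, and the toy problem are assumptions for illustration only, not the paper's reference implementation.

```python
import numpy as np

def snr_step(x, F, DF, sketch_dim=5, step_size=0.5, rng=None):
    """One illustrative Sketched Newton-Raphson (SNR) iteration.

    x          : current iterate, shape (p,)
    F          : callable returning F(x), shape (m,)
    DF         : callable returning the Jacobian DF(x), shape (m, p)
    sketch_dim : number of columns tau of the random sketch S
    step_size  : step size gamma
    """
    rng = np.random.default_rng() if rng is None else rng
    Fx = F(x)                                  # F(x) in R^m
    Jx = DF(x)                                 # Jacobian DF(x) in R^{m x p}
    m = Fx.shape[0]
    S = rng.standard_normal((m, sketch_dim))   # Gaussian sketch S in R^{m x tau}
    SJ = S.T @ Jx                              # sketched Jacobian S^T DF(x), shape (tau, p)
    # Only a small tau x tau system is solved instead of the full Newton system.
    G = SJ @ SJ.T                              # S^T DF(x) DF(x)^T S
    lam = np.linalg.pinv(G) @ (S.T @ Fx)       # pseudoinverse handles rank deficiency
    return x - step_size * (SJ.T @ lam)

# Toy usage (assumed example): drive a small nonlinear system toward F(x) = 0.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 10))
    F = lambda x: np.tanh(A @ x)                               # root at x = 0
    DF = lambda x: (1.0 - np.tanh(A @ x) ** 2)[:, None] * A    # Jacobian of tanh(Ax)
    x = rng.standard_normal(10)
    for _ in range(200):
        x = snr_step(x, F, DF, sketch_dim=5, step_size=0.5, rng=rng)
    print("residual ||F(x)|| =", np.linalg.norm(F(x)))
```

The key design point the abstract highlights is visible here: each step only forms and solves a small sketched system of size `sketch_dim`, rather than the full $m \times p$ Newton-Raphson system, which is what makes the method scalable to large nonlinear systems.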