Paper Title

Enhance Curvature Information by Structured Stochastic Quasi-Newton Methods

Authors

Minghan Yang, Dong Xu, Hongyu Chen, Zaiwen Wen, Mengyun Chen

Abstract

In this paper, we consider stochastic second-order methods for minimizing a finite sum of nonconvex functions. A key challenge is to find an ingenious but cheap scheme for incorporating local curvature information. Since the true Hessian matrix is often a combination of a cheap part and an expensive part, we propose a structured stochastic quasi-Newton method that uses as much partial Hessian information as possible. By further exploiting either the low-rank structure or the Kronecker-product properties of the quasi-Newton approximations, the computation of the quasi-Newton direction becomes affordable. Global convergence to a stationary point and a local superlinear convergence rate are established under some mild assumptions. Numerical results on logistic regression, deep autoencoder networks, and deep convolutional neural networks show that our proposed method is quite competitive with state-of-the-art methods.
