Paper Title

On the asymptotic rate of convergence of Stochastic Newton algorithms and their Weighted Averaged versions

Authors

Claire Boyer, Antoine Godichon-Baggioni

Abstract

The majority of machine learning methods can be regarded as the minimization of an unavailable risk function. To optimize the latter, given samples provided in a streaming fashion, we define a general stochastic Newton algorithm and its weighted averaged version. In several use cases, both implementations will be shown not to require the inversion of a Hessian estimate at each iteration; instead, a direct update of the estimate of the inverse Hessian is favored. This generalizes a trick introduced in [2] for the specific case of logistic regression, where the estimate of the inverse Hessian is updated directly. Under mild assumptions such as local strong convexity at the optimum, we establish the almost sure convergence and rates of convergence of the algorithms, as well as central limit theorems for the constructed parameter estimates. The unified framework considered in this paper covers the cases of linear, logistic, and softmax regression, to name a few. Numerical experiments on simulated data give empirical evidence of the relevance of the proposed methods, which outperform popular competitors, particularly in the case of bad initializations.
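
To make the direct update of the inverse Hessian concrete, here is a minimal sketch for the logistic-regression case (not the authors' reference code): the per-sample Hessian contribution is rank one, so the inverse-Hessian estimate can be refreshed with the Sherman-Morrison formula instead of inverting a matrix at each iteration. The function name `stochastic_newton_logistic`, the floor on the curvature term, and the logarithmic averaging weights are illustrative assumptions, not taken verbatim from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def stochastic_newton_logistic(stream, d, n_iter, w=2.0):
    """Streaming stochastic Newton sketch for logistic regression.

    stream(n) must return one sample (x, y), with x of shape (d,) and y in {0, 1}.
    Returns the last iterate and a log-weighted averaged iterate.
    """
    theta = np.zeros(d)        # current stochastic Newton iterate
    theta_avg = np.zeros(d)    # weighted averaged iterate
    S_inv = np.eye(d)          # inverse of S_0 = I_d, updated directly below
    weight_sum = 0.0

    for n in range(1, n_iter + 1):
        x, y = stream(n)
        p = sigmoid(x @ theta)

        # Per-sample Hessian contribution is alpha * x x^T, a rank-one matrix.
        alpha = max(p * (1.0 - p), 1e-10)   # small floor for numerical safety
        phi = np.sqrt(alpha) * x

        # Sherman-Morrison: obtain S_n^{-1} from S_{n-1}^{-1}, where
        # S_n = S_{n-1} + phi phi^T, without any matrix inversion.
        S_phi = S_inv @ phi
        S_inv -= np.outer(S_phi, S_phi) / (1.0 + phi @ S_phi)

        # Newton-type step: S_n grows linearly in n, so S_n^{-1} already
        # carries a 1/n decay of the effective step size.
        grad = (p - y) * x
        theta = theta - S_inv @ grad

        # Weighted averaging with logarithmic weights (illustrative choice).
        wgt = np.log(n + 1.0) ** w
        weight_sum += wgt
        theta_avg += (wgt / weight_sum) * (theta - theta_avg)

    return theta, theta_avg
```

For instance, feeding the routine a stream that draws `x` from a standard Gaussian and `y` from a Bernoulli with parameter `sigmoid(x @ theta_star)` lets both the last iterate and the averaged iterate approach `theta_star`; the averaged iterate plays the role of the weighted averaged version mentioned in the abstract.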
