Paper Title

Explore Aggressively, Update Conservatively: Stochastic Extragradient Methods with Variable Stepsize Scaling

Paper Authors

Hsieh, Yu-Guan, Iutzeler, Franck, Malick, Jérôme, Mertikopoulos, Panayotis

Paper Abstract

Owing to their stability and convergence speed, extragradient methods have become a staple for solving large-scale saddle-point problems in machine learning. The basic premise of these algorithms is the use of an extrapolation step before performing an update; thanks to this exploration step, extra-gradient methods overcome many of the non-convergence issues that plague gradient descent/ascent schemes. On the other hand, as we show in this paper, running vanilla extragradient with stochastic gradients may jeopardize its convergence, even in simple bilinear models. To overcome this failure, we investigate a double stepsize extragradient algorithm where the exploration step evolves at a more aggressive time-scale compared to the update step. We show that this modification allows the method to converge even with stochastic gradients, and we derive sharp convergence rates under an error bound condition.
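
To make the double-stepsize scheme concrete, below is a minimal NumPy sketch of a stochastic extragradient loop in which the exploration (extrapolation) stepsize decays more slowly, i.e. more aggressively, than the update stepsize. The oracle `noisy_operator`, the base stepsizes, and the t^(-1/3) / t^(-2/3) schedules are illustrative assumptions for the sketch, not the exact choices or conditions analyzed in the paper.

```python
import numpy as np

def stochastic_extragradient(oracle, z0, n_iters=20_000, gamma0=1.0, eta0=0.5):
    """Minimal sketch of stochastic extragradient with two stepsizes.

    `oracle(z)` returns a noisy estimate of the saddle-point operator at z.
    The exploration stepsize gamma_t decays more slowly ("aggressively")
    than the update stepsize eta_t; both schedules are illustrative only.
    """
    z = np.asarray(z0, dtype=float)
    for t in range(1, n_iters + 1):
        gamma_t = gamma0 / t ** (1 / 3)   # exploration stepsize (slow decay)
        eta_t = eta0 / t ** (2 / 3)       # update stepsize (fast decay)
        z_lead = z - gamma_t * oracle(z)  # extrapolation / exploration step
        z = z - eta_t * oracle(z_lead)    # conservative update from the base point
    return z

# Hypothetical usage: bilinear saddle point min_x max_y x*y, whose operator
# is (y, -x), queried through an oracle with additive Gaussian noise.
rng = np.random.default_rng(0)

def noisy_operator(z):
    x, y = z
    return np.array([y, -x]) + 0.1 * rng.standard_normal(2)

print(stochastic_extragradient(noisy_operator, z0=[1.0, 1.0]))  # approaches (0, 0)
```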
