Paper Title
A new regret analysis for Adam-type algorithms
Paper Authors
Paper Abstract
In this paper, we focus on a theory-practice gap for Adam and its variants (AMSGrad, AdamNC, etc.). In practice, these algorithms are used with a constant first-order moment parameter $\beta_1$ (typically between $0.9$ and $0.99$). In theory, regret guarantees for online convex optimization require a rapidly decaying schedule $\beta_1 \to 0$. We show that this is an artifact of the standard analysis and propose a novel framework that allows us to derive optimal, data-dependent regret bounds with a constant $\beta_1$, without further assumptions. We also demonstrate the flexibility of our analysis on a wide range of algorithms and settings.
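For context, here is a minimal sketch of one Adam-style update with a constant first-moment parameter $\beta_1$, the regime the paper analyzes. The function name, default values, and NumPy usage below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update with a constant beta1 (illustrative sketch).

    w: parameters; grad: gradient at w; m, v: running first/second
    moment estimates; t: 1-indexed step count for bias correction.
    """
    m = beta1 * m + (1 - beta1) * grad        # first moment, constant beta1
    v = beta2 * v + (1 - beta2) * grad**2     # second moment
    m_hat = m / (1 - beta1**t)                # bias corrections
    v_hat = v / (1 - beta2**t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```

The gap the abstract describes is that prior regret analyses cover such updates only under schedules with $\beta_1 \to 0$, whereas constants like `beta1=0.9` above are what practitioners actually use.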