Paper title

On dissipative symplectic integration with applications to gradient-based optimization

Authors

Guilherme França, Michael I. Jordan, René Vidal

Abstract

Recently, continuous-time dynamical systems have proved useful in providing conceptual and quantitative insights into gradient-based optimization, which is widely used in modern machine learning and statistics. An important question that arises in this line of work is how to discretize the system in such a way that its stability and rates of convergence are preserved. In this paper we propose a geometric framework in which such discretizations can be realized systematically, enabling the derivation of "rate-matching" algorithms without the need for a discrete convergence analysis. More specifically, we show that a generalization of symplectic integrators to nonconservative, and in particular dissipative, Hamiltonian systems is able to preserve rates of convergence up to a controlled error. Moreover, such methods preserve a shadow Hamiltonian despite the absence of a conservation law, extending key results of symplectic integrators to nonconservative cases. Our arguments rely on a combination of backward error analysis with fundamental results from symplectic geometry. We stress that although the original motivation for this work was the application to optimization, where dissipative systems play a natural role, our results are fully general: they not only provide a differential geometric framework for dissipative Hamiltonian systems but also substantially extend the theory of structure-preserving integration.
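To make the idea of a dissipative generalization of a symplectic integrator concrete, the following is a minimal sketch and not the paper's own algorithm. It assumes the damped Hamiltonian system q' = p, p' = -∇f(q) - γp with a constant damping coefficient γ, and discretizes it with a conformal symplectic Euler step, a well-known scheme chosen here for illustration. The function names, the quadratic test objective, and all parameter values are hypothetical.

```python
import math


def conformal_symplectic_euler(grad_f, q0, p0, step, damping, n_steps):
    """Integrate q' = p, p' = -grad_f(q) - damping * p (illustrative sketch).

    Each step damps the momentum exactly by exp(-damping * step) and then
    applies a symplectic Euler update, so the map contracts phase-space
    area by a fixed factor instead of preserving it.
    """
    q = list(q0)
    p = list(p0)
    decay = math.exp(-damping * step)  # exact flow of the damping term
    for _ in range(n_steps):
        g = grad_f(q)
        # Momentum update: damp, then kick with the gradient at the current q.
        p = [decay * pi - step * gi for pi, gi in zip(p, g)]
        # Position update uses the NEW momentum (symplectic Euler ordering).
        q = [qi + step * pi for qi, pi in zip(q, p)]
    return q, p


def quadratic_grad(q):
    # Gradient of the assumed test objective f(q) = 0.5 * (q[0]**2 + 10 * q[1]**2).
    return [q[0], 10.0 * q[1]]


q_final, p_final = conformal_symplectic_euler(
    quadratic_grad, q0=[1.0, 1.0], p0=[0.0, 0.0],
    step=0.1, damping=1.0, n_steps=500,
)
print(q_final)
```

Eliminating the momentum shows that this update is a heavy-ball-style iteration, q_{k+1} = q_k + e^{-γh}(q_k - q_{k-1}) - h²∇f(q_k), which is one way to see why integrators of this kind are of interest for gradient-based optimization.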
