Paper Title

Faster Convergence of Stochastic Gradient Langevin Dynamics for Non-Log-Concave Sampling

Paper Authors

Difan Zou, Pan Xu, Quanquan Gu

Paper Abstract

We provide a new convergence analysis of stochastic gradient Langevin dynamics (SGLD) for sampling from a class of distributions that can be non-log-concave. At the core of our approach is a novel conductance analysis of SGLD using an auxiliary time-reversible Markov Chain. Under certain conditions on the target distribution, we prove that $\tilde O(d^4ε^{-2})$ stochastic gradient evaluations suffice to guarantee $ε$-sampling error in terms of the total variation distance, where $d$ is the problem dimension. This improves existing results on the convergence rate of SGLD (Raginsky et al., 2017; Xu et al., 2018). We further show that provided an additional Hessian Lipschitz condition on the log-density function, SGLD is guaranteed to achieve $ε$-sampling error within $\tilde O(d^{15/4}ε^{-3/2})$ stochastic gradient evaluations. Our proof technique provides a new way to study the convergence of Langevin-based algorithms and sheds some light on the design of fast stochastic gradient-based sampling algorithms.
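
For context, the SGLD iteration analyzed in the paper takes the form $x_{k+1} = x_k + \eta\, g_k + \sqrt{2\eta}\,\xi_k$, where $g_k$ is a stochastic estimate of the gradient of the log target density and $\xi_k \sim N(0, I_d)$. The following is a minimal illustrative sketch, not code from the paper; the step size `eta`, the iteration count, and the toy double-well target are assumptions chosen only for demonstration.

```python
import numpy as np

def sgld_sample(grad_log_density, x0, eta, n_steps, rng=None):
    """Minimal SGLD sketch: x_{k+1} = x_k + eta * g_k + sqrt(2*eta) * noise.

    grad_log_density(x) may return a stochastic (e.g. mini-batch) estimate
    of the gradient of the log target density.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    samples = []
    for _ in range(n_steps):
        g = grad_log_density(x)                        # (stochastic) gradient of log-density
        noise = rng.standard_normal(x.shape)           # isotropic Gaussian injection
        x = x + eta * g + np.sqrt(2.0 * eta) * noise   # Euler-Maruyama Langevin step
        samples.append(x.copy())
    return np.array(samples)

# Toy usage: a non-log-concave (bimodal) target with log p(x) = -x^4/4 + x^2/2,
# purely illustrative and unrelated to the paper's experiments.
if __name__ == "__main__":
    grad_log_p = lambda x: -x**3 + x
    draws = sgld_sample(grad_log_p, x0=np.zeros(1), eta=1e-2, n_steps=5000)
    print(draws[-5:].ravel())
```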
