Paper Title

SYNTHESIS: A Semi-Asynchronous Path-Integrated Stochastic Gradient Method for Distributed Learning in Computing Clusters

Paper Authors

Zhuqing Liu, Xin Zhang, Jia Liu

Paper Abstract

To increase the training speed of distributed learning, recent years have witnessed a significant amount of interest in developing both synchronous and asynchronous distributed stochastic variance-reduced optimization methods. However, all existing synchronous and asynchronous distributed training algorithms suffer from various limitations in either convergence speed or implementation complexity. This motivates us to propose an algorithm called SYNTHESIS (semi-asynchronous path-integrated stochastic gradient search), which leverages the special structure of the variance-reduction framework to overcome the limitations of both synchronous and asynchronous distributed learning algorithms while retaining their salient features. We consider two implementations of SYNTHESIS under distributed and shared memory architectures. We show that our SYNTHESIS algorithms have $O(\sqrt{N}\epsilon^{-2}(\Delta+1)+N)$ and $O(\sqrt{N}\epsilon^{-2}(\Delta+1)d+N)$ computational complexities for achieving an $\epsilon$-stationary point in non-convex learning under distributed and shared memory architectures, respectively, where $N$ denotes the total number of training samples and $\Delta$ represents the maximum delay of the workers. Moreover, we investigate the generalization performance of SYNTHESIS by establishing algorithmic stability bounds for quadratic strongly convex and non-convex optimization. We further conduct extensive numerical experiments to verify our theoretical findings.
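
For readers unfamiliar with the "path-integrated" variance-reduction idea the abstract builds on, below is a minimal single-machine sketch of a SPIDER-style path-integrated gradient estimator. It is an illustration under our own assumptions, not the paper's SYNTHESIS algorithm: the function names (`path_integrated_descent`, `grad_fn`, `full_grad_fn`, `batch_sampler`), the refresh period `q`, and the step size `lr` are hypothetical, and the semi-asynchronous worker/server protocol and delay handling that define SYNTHESIS are not modeled here.

```python
import numpy as np

def path_integrated_descent(grad_fn, full_grad_fn, x0, n_steps,
                            batch_sampler, q=50, lr=0.01):
    """Illustrative SPIDER-style path-integrated gradient descent.

    grad_fn(x, batch)  -> stochastic gradient on a mini-batch (assumed signature)
    full_grad_fn(x)    -> full-batch gradient, recomputed every q steps
    batch_sampler()    -> returns the next mini-batch of sample indices
    """
    x = np.asarray(x0, dtype=float)
    v = full_grad_fn(x)                      # anchor the estimator with a full gradient
    for t in range(1, n_steps + 1):
        x_next = x - lr * v                  # descend along the current estimate
        if t % q == 0:
            v = full_grad_fn(x_next)         # periodic full-gradient refresh
        else:
            batch = batch_sampler()
            # Path-integrated (recursive) update: accumulate the gradient
            # difference along the iterate path to keep the estimator's variance small.
            v = v + grad_fn(x_next, batch) - grad_fn(x, batch)
        x = x_next
    return x

# Tiny usage example on a synthetic least-squares problem (hypothetical data).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A, b = rng.normal(size=(200, 10)), rng.normal(size=200)
    grad = lambda x, idx: A[idx].T @ (A[idx] @ x - b[idx]) / len(idx)
    full_grad = lambda x: A.T @ (A @ x - b) / len(b)
    sampler = lambda: rng.integers(0, 200, size=16)
    x_hat = path_integrated_descent(grad, full_grad, np.zeros(10),
                                    n_steps=500, batch_sampler=sampler)
    print("final gradient norm:", np.linalg.norm(full_grad(x_hat)))
```

The periodic full-gradient refresh plus the recursive difference updates is what gives path-integrated estimators their low variance; SYNTHESIS layers a semi-asynchronous distributed execution model on top of this kind of estimator.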
