Paper Title

Periodic Stochastic Gradient Descent with Momentum for Decentralized Training

Paper Authors

Hongchang Gao, Heng Huang

Paper Abstract

Decentralized training has been actively studied in recent years. Although a wide variety of methods have been proposed, the decentralized momentum SGD method remains underexplored. In this paper, we propose a novel periodic decentralized momentum SGD method, which employs the momentum scheme and periodic communication for decentralized training. Because of these two strategies, as well as the topology of the decentralized training system, the theoretical convergence analysis of our method is difficult. We address this challenging problem and provide the condition under which our method achieves linear speedup with respect to the number of workers. Furthermore, we introduce a communication-efficient variant that reduces the cost of each communication round. The condition for achieving linear speedup is also provided for this variant. To the best of our knowledge, both methods are the first to achieve these theoretical results in their respective settings. We conduct extensive experiments to verify the performance of the two proposed methods, and both show superior performance over existing methods.
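To make the setup concrete, below is a minimal sketch of periodic decentralized momentum SGD, not the authors' implementation: each worker runs local momentum SGD and, every few steps, gossip-averages its parameters with neighbors through a mixing matrix defined by the topology. The toy quadratic loss, the ring topology, and the names `period`, `momentum`, `lr`, and `W` are illustrative assumptions, not the paper's notation.

```python
# Sketch of periodic decentralized momentum SGD on a toy quadratic
# objective (assumptions: ring topology, synthetic loss; not the
# authors' code).
import numpy as np

n_workers, dim = 4, 10
period = 5            # communicate every `period` local steps (assumed name)
lr, momentum = 0.05, 0.9
rng = np.random.default_rng(0)

# Symmetric, doubly stochastic mixing matrix for a ring topology:
# each worker averages with itself and its two neighbors.
W = np.zeros((n_workers, n_workers))
for i in range(n_workers):
    W[i, i] = 1 / 3
    W[i, (i - 1) % n_workers] = 1 / 3
    W[i, (i + 1) % n_workers] = 1 / 3

x = rng.normal(size=(n_workers, dim))   # per-worker parameters
v = np.zeros_like(x)                    # per-worker momentum buffers
target = rng.normal(size=dim)           # shared optimum of the toy loss

def stochastic_grad(xi):
    """Gradient of 0.5 * ||xi - target||^2 plus noise (stochastic)."""
    return (xi - target) + 0.1 * rng.normal(size=xi.shape)

for t in range(1, 101):
    # Local momentum SGD step on every worker.
    for i in range(n_workers):
        v[i] = momentum * v[i] + stochastic_grad(x[i])
        x[i] -= lr * v[i]
    # Periodic communication: gossip-average parameters with neighbors.
    if t % period == 0:
        x = W @ x

print("consensus error:", np.linalg.norm(x - x.mean(axis=0)))
print("distance to optimum:", np.linalg.norm(x.mean(axis=0) - target))
```

Note that this sketch already exchanges only the parameter matrix `x` during communication, so the momentum buffers never cross the network; the paper's communication-efficient variant further reduces the per-round cost, though the abstract does not specify its exact mechanism.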
