Paper Title

Decentralized Learning with Lazy and Approximate Dual Gradients

Paper Authors

Yanli Liu, Yuejiao Sun, Wotao Yin

Paper Abstract

This paper develops algorithms for decentralized machine learning over a network, where data are distributed, computation is localized, and communication is restricted to neighbors. A line of recent research in this area focuses on improving both computation and communication complexities. The methods SSDA and MSDA \cite{scaman2017optimal} have optimal communication complexity when the objective is smooth and strongly convex, and they are simple to derive. However, they require solving a subproblem at each step. We propose new algorithms that save computation by using (stochastic) gradients and save communication when previous information is sufficiently useful. Our methods remain relatively simple: rather than solving a subproblem, they run Katyusha for a small, fixed number of steps from the latest point. An easy-to-compute, local rule is used to decide whether a worker can skip a round of communication. Furthermore, our methods provably reduce the communication and computation complexities of SSDA and MSDA. In numerical experiments, our algorithms achieve significant reductions in computation and communication compared with the state of the art.
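
Since the abstract describes the two key ingredients (approximate dual gradients from a few inner steps, and a local rule for skipping communication) only in words, the following is a minimal Python sketch of that general idea on a toy decentralized least-squares problem. It uses plain gradient steps as a stand-in for the Katyusha inner solver, and the function names, the norm-based skip rule, and all parameters (inner_steps, skip_tol, alpha) are illustrative assumptions, not the paper's actual algorithm or interface.

import numpy as np

def inexact_local_min(A_i, b_i, y_i, x0, inner_steps=5, lr=0.01):
    # Approximately minimize 0.5*||A_i x - b_i||^2 + <y_i, x>, starting from
    # the latest point x0, with a small fixed number of inner steps
    # (plain gradient steps here as a stand-in for a Katyusha-style solver).
    x = x0.copy()
    for _ in range(inner_steps):
        grad = A_i.T @ (A_i @ x - b_i) + y_i
        x = x - lr * grad
    return x

def lazy_dual_ascent(local_data, W, rounds=300, alpha=0.5, skip_tol=1e-3):
    n = len(local_data)
    dim = local_data[0][0].shape[1]
    x = np.zeros((n, dim))       # local primal estimates
    y = np.zeros((n, dim))       # local dual variables
    sent = np.zeros((n, dim))    # last value each worker actually broadcast
    comms = 0
    L = np.eye(n) - W            # Laplacian-like operator; L x = 0 iff consensus

    for _ in range(rounds):
        # Approximate dual gradient: a few cheap inner steps per worker
        # instead of solving the local subproblem exactly.
        for i, (A_i, b_i) in enumerate(local_data):
            x[i] = inexact_local_min(A_i, b_i, y[i], x[i])

        # Lazy communication: an easy-to-compute local rule decides whether
        # worker i broadcasts; otherwise neighbors reuse its previous copy.
        for i in range(n):
            if np.linalg.norm(x[i] - sent[i]) > skip_tol:
                sent[i] = x[i].copy()
                comms += 1

        # Dual ascent step driven by the (possibly stale) broadcast values.
        y = y + alpha * (L @ sent)

    return x, comms

As a toy run, one could call lazy_dual_ascent on, say, three workers holding random least-squares data with W = np.full((3, 3), 1/3) as the mixing matrix; the returned comms counter makes the communication rounds saved by skip_tol directly visible.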
