Paper Title
A Better Alternative to Error Feedback for Communication-Efficient Distributed Learning
Paper Authors
Paper Abstract
Modern large-scale machine learning applications require stochastic optimization algorithms to be implemented on distributed compute systems. A key bottleneck of such systems is the communication overhead for exchanging information across the workers, such as stochastic gradients. Among the many techniques proposed to remedy this issue, one of the most successful is the framework of compressed communication with error feedback (EF). EF remains the only known technique that can deal with the error induced by contractive compressors which are not unbiased, such as Top-$K$. In this paper, we propose a new and theoretically and practically better alternative to EF for dealing with contractive compressors. In particular, we propose a construction which can transform any contractive compressor into an induced unbiased compressor. Following this transformation, existing methods able to work with unbiased compressors can be applied. We show that our approach leads to vast improvements over EF, including reduced memory requirements, better communication complexity guarantees and fewer assumptions. We further extend our results to federated learning with partial participation following an arbitrary distribution over the nodes, and demonstrate the benefits thereof. We perform several numerical experiments which validate our theoretical findings.
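The abstract describes a construction that turns any contractive (biased) compressor into an induced unbiased one. The sketch below illustrates one natural instantiation consistent with that description, using Top-$K$ as the contractive compressor and Rand-$K$ (random sparsification with unbiased rescaling) to compress the residual; the function names and the choice of Rand-$K$ for the residual are illustrative assumptions, not details taken from the paper. Because the residual compressor is unbiased, the composite satisfies E[induced(x)] = top_k(x) + (x - top_k(x)) = x.

```python
import numpy as np

rng = np.random.default_rng(0)

def top_k(x, k):
    # Contractive (biased) Top-K compressor: keep the k largest-magnitude entries.
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]
    out[idx] = x[idx]
    return out

def rand_k(x, k):
    # Unbiased Rand-K compressor: keep k uniformly random entries, scaled by d/k
    # so that the expectation over the random index choice equals x.
    d = x.size
    out = np.zeros_like(x)
    idx = rng.choice(d, size=k, replace=False)
    out[idx] = x[idx] * (d / k)
    return out

def induced(x, k):
    # Induced unbiased compressor (illustrative sketch): compress with Top-K,
    # then compress the residual with an unbiased compressor and add it back.
    c = top_k(x, k)
    return c + rand_k(x - c, k)

# Empirical unbiasedness check: the average over many independent draws
# should approach x itself.
x = rng.standard_normal(10)
est = np.mean([induced(x, 3) for _ in range(20000)], axis=0)
print(np.max(np.abs(est - x)))
```

Note the communication trade-off implied here: the induced compressor sends two sparse vectors (the Top-$K$ output and the compressed residual) instead of one, in exchange for unbiasedness, which lets existing analyses for unbiased compressors apply directly.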