Paper Title

Fast-Convergent Federated Learning with Adaptive Weighting

Paper Authors

Hongda Wu, Ping Wang

Paper Abstract

Federated learning (FL) enables resource-constrained edge nodes to collaboratively learn a global model under the orchestration of a central server while keeping privacy-sensitive data locally. Non-independent-and-identically-distributed (non-IID) data samples across participating nodes slow model training and impose additional communication rounds for FL to converge. In this paper, we propose the Federated Adaptive Weighting (FedAdp) algorithm, which aims to accelerate model convergence in the presence of nodes with non-IID datasets. Through theoretical and empirical analysis, we observe an implicit connection between a node's contribution to the global model aggregation and the data distribution on that node. We then propose to adaptively assign different weights for updating the global model based on node contribution in each training round. The contribution of a participating node is first measured by the angle between its local gradient vector and the global gradient vector, and the weight is then quantified by a designed non-linear mapping function. This simple yet effective strategy dynamically reinforces positive (and suppresses negative) node contributions, drastically reducing the number of communication rounds. Its superiority over the commonly adopted Federated Averaging (FedAvg) algorithm is verified both theoretically and experimentally. With extensive experiments performed in PyTorch and PySyft, we show that FL training with FedAdp can reduce the number of communication rounds by up to 54.1% on the MNIST dataset and up to 45.4% on the FashionMNIST dataset, as compared to FedAvg.
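
For intuition, below is a minimal numerical sketch (not the authors' released implementation) of the aggregation idea described in the abstract: each node's contribution is scored by the angle between its local gradient and the global gradient, passed through a non-linear mapping, and converted into aggregation weights. The function names (`fedadp_weights`, `aggregate`), the Gompertz-like mapping, the `alpha` parameter, and the softmax-style normalization are illustrative assumptions; the paper's exact mapping function and angle-smoothing procedure are defined in the full text.

```python
# Illustrative sketch of angle-based adaptive weighting (assumptions noted above).
import numpy as np

def fedadp_weights(local_grads, data_sizes, alpha=5.0):
    """Compute aggregation weights from local/global gradient angles.

    local_grads: list of 1-D numpy arrays (flattened local gradients)
    data_sizes:  list of local dataset sizes (baseline weighting)
    alpha:       steepness of the (assumed) non-linear mapping
    """
    sizes = np.asarray(data_sizes, dtype=float)
    base = sizes / sizes.sum()

    # Global gradient: data-size-weighted average of local gradients.
    global_grad = sum(w * g for w, g in zip(base, local_grads))

    # Angle (radians) between each local gradient and the global gradient.
    angles = np.array([
        np.arccos(np.clip(
            np.dot(g, global_grad) /
            (np.linalg.norm(g) * np.linalg.norm(global_grad) + 1e-12),
            -1.0, 1.0))
        for g in local_grads
    ])

    # Non-linear mapping: a smaller angle (better-aligned, "positive" node)
    # yields a larger score. A Gompertz-like curve is one plausible choice.
    scores = alpha * (1.0 - np.exp(-np.exp(-alpha * (angles - 1.0))))

    # Combine with data size and normalize to obtain aggregation weights.
    weights = base * np.exp(scores)
    return weights / weights.sum()

def aggregate(local_models, weights):
    """Weighted average of flattened local model parameter vectors."""
    return sum(w * m for w, m in zip(weights, local_models))
```

In this sketch, a node whose local gradient points away from the global gradient (large angle, e.g. due to highly skewed non-IID data) receives a smaller weight, while well-aligned nodes are reinforced, which is the mechanism the abstract credits for the reduction in communication rounds.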
