Paper Title
Multi-Armed Bandit Based Client Scheduling for Federated Learning
Paper Authors
Paper Abstract
By exploiting the computing power and local data of distributed clients, federated learning (FL) offers attractive properties such as reduced communication overhead and preserved data privacy. In each communication round of FL, the clients update local models based on their own data and upload their local updates via wireless channels. However, the latency caused by hundreds to thousands of communication rounds remains a bottleneck in FL. To minimize the training latency, this work provides a multi-armed bandit-based framework for online client scheduling (CS) in FL without knowledge of wireless channel state information or the statistical characteristics of clients. First, we propose a CS algorithm based on the upper confidence bound policy (CS-UCB) for the ideal scenario in which the local datasets of clients are independent and identically distributed (i.i.d.) and balanced. An upper bound on the expected performance regret of the proposed CS-UCB algorithm is provided, which indicates that the regret grows logarithmically over communication rounds. Then, to address non-ideal scenarios with non-i.i.d. and unbalanced local datasets and varying availability of clients, we further propose a CS algorithm based on the UCB policy and the virtual queue technique (CS-UCB-Q). An upper bound is also derived, which shows that the expected performance regret of the proposed CS-UCB-Q algorithm can grow sub-linearly over communication rounds under certain conditions. In addition, the convergence performance of FL training is analyzed. Finally, simulation results validate the efficiency of the proposed algorithms.
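To illustrate the core idea behind UCB-based client scheduling, the following is a minimal sketch (not the paper's exact CS-UCB algorithm): each round, the scheduler picks the `m` clients with the highest upper-confidence-bound index on their unknown mean reward (e.g. a normalized measure of how quickly a client returns its update), then refines the empirical means from observed feedback. The names `sample_reward`, `m`, and the exploration constant `c` are illustrative assumptions, not from the paper.

```python
import math

def ucb_schedule(num_clients, num_rounds, m, sample_reward, c=2.0):
    """Sketch of UCB-style client scheduling (illustrative, not the
    paper's CS-UCB): each round, schedule the m clients with the
    highest UCB index, then update their empirical mean rewards."""
    counts = [0] * num_clients   # times each client has been scheduled
    means = [0.0] * num_clients  # empirical mean reward per client
    for t in range(1, num_rounds + 1):
        # UCB index = empirical mean + exploration bonus;
        # unscheduled clients get an infinite index so each is tried once.
        ucb = [
            means[i] + math.sqrt(c * math.log(t) / counts[i])
            if counts[i] > 0 else float("inf")
            for i in range(num_clients)
        ]
        chosen = sorted(range(num_clients),
                        key=lambda i: ucb[i], reverse=True)[:m]
        for i in chosen:
            r = sample_reward(i)          # observed reward this round
            counts[i] += 1
            means[i] += (r - means[i]) / counts[i]  # incremental mean
    return means, counts
```

With deterministic rewards, the client with the highest mean ends up scheduled in almost every round, while the logarithmic exploration bonus guarantees that every client is still probed occasionally, which is what keeps the regret growth logarithmic.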