Paper Title
Convergence of Federated Learning over a Noisy Downlink
Paper Authors
Paper Abstract
We study federated learning (FL), where power-limited wireless devices utilize their local datasets to collaboratively train a global model with the help of a remote parameter server (PS). The PS has access to the global model and shares it with the devices for local training, and the devices return the results of their local updates to the PS to update the global model. This framework requires downlink transmission from the PS to the devices and uplink transmission from the devices to the PS. The goal of this study is to investigate the impact of the bandwidth-limited shared wireless medium in both the downlink and uplink on the performance of FL, with a focus on the downlink. To this end, the downlink and uplink channels are modeled as fading broadcast and multiple access channels, respectively, both with limited bandwidth. For downlink transmission, we first introduce a digital approach, where a quantization technique is employed at the PS to broadcast the global model update at a common rate such that all the devices can decode it. Next, we propose analog downlink transmission, where the global model is broadcast by the PS in an uncoded manner. We consider analog transmission over the uplink in both cases. We further analyze the convergence behavior of the proposed analog approach, assuming that the uplink transmission is error-free. Numerical experiments show that the analog downlink approach provides significant improvement over the digital one, despite a significantly lower transmit power at the PS. The experimental results corroborate the convergence results and show that a smaller number of local iterations should be used when the data distribution is more biased, and also when the devices have a better estimate of the global model in the analog downlink approach.
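To make the setting concrete, the following is a minimal toy sketch of federated averaging in which each device receives a noisy copy of the global model, as it might with an uncoded analog downlink, while the uplink is assumed error-free. It is not the scheme from the paper: fading, bandwidth and power constraints, and quantization are all omitted, and the quadratic local objectives, the function name `run_fl`, and every parameter value are assumptions chosen purely for illustration.

```python
import math
import random


def run_fl(noise_std, local_steps, rounds=50, num_devices=10, dim=4, lr=0.1, seed=0):
    """Toy federated averaging with a noisy downlink; returns the final error norm."""
    rng = random.Random(seed)
    # Hypothetical local data: device i minimises 0.5 * ||w - centers[i]||^2,
    # so the global optimum is the average of the centers.
    centers = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_devices)]
    optimum = [sum(c[k] for c in centers) / num_devices for k in range(dim)]
    global_w = [0.0] * dim
    for _round in range(rounds):
        returned = []
        for c in centers:
            # Downlink (assumed model): the device sees the global model plus Gaussian noise.
            w = [g + rng.gauss(0.0, noise_std) for g in global_w]
            # Local training: a few gradient steps on the device's own objective.
            for _step in range(local_steps):
                w = [wk - lr * (wk - ck) for wk, ck in zip(w, c)]
            returned.append(w)
        # Uplink assumed error-free: the PS averages the local models exactly.
        global_w = [sum(w[k] for w in returned) / num_devices for k in range(dim)]
    # Euclidean distance between the final global model and the optimum.
    return math.sqrt(sum((g - o) ** 2 for g, o in zip(global_w, optimum)))


clean_error = run_fl(noise_std=0.0, local_steps=5)
noisy_error = run_fl(noise_std=0.5, local_steps=5)
print(f"noise-free downlink error: {clean_error:.2e}")
print(f"noisy downlink error:      {noisy_error:.2e}")
```

Varying `local_steps` and `noise_std` gives a rough feel for how downlink noise and the number of local iterations interact, though a toy of this kind should not be expected to reproduce the paper's analysis or experimental results.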