Paper Title

Meta Knowledge Condensation for Federated Learning

Authors

Ping Liu, Xin Yu, Joey Tianyi Zhou

Abstract

Existing federated learning paradigms usually exchange distributed models extensively at a central server to obtain a more powerful global model. However, this incurs a severe communication burden between the server and multiple clients, especially when data distributions are heterogeneous. As a result, current federated learning methods often require a large number of communication rounds during training. Unlike existing paradigms, we introduce an alternative perspective that significantly decreases the communication cost of federated learning. In this work, we first introduce a meta knowledge representation method that extracts meta knowledge from distributed clients. The extracted meta knowledge encodes essential information that can be used to improve the current model. As training progresses, the contributions of individual training samples to the federated model also vary. Thus, we introduce a dynamic weight assignment mechanism that enables samples to contribute adaptively to the current model update. Then, informative meta knowledge from all active clients is sent to the server for the model update. Training a model on the combined meta knowledge, without exposing the original data of different clients, significantly mitigates the heterogeneity issues. Moreover, to further ameliorate data heterogeneity, we also exchange meta knowledge among clients as conditional initialization for local meta knowledge extraction. Extensive experiments demonstrate the effectiveness and efficiency of our proposed method. Remarkably, our method outperforms the state-of-the-art by a large margin (from $74.07\%$ to $92.95\%$) on MNIST under a restricted communication budget (i.e., 10 rounds).
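
To make the pipeline concrete, below is a minimal PyTorch sketch of the communication pattern the abstract describes, under the assumption that meta knowledge extraction uses a dataset-condensation-style gradient-matching objective. The function names (condense_local_data, server_round) and all hyperparameters are illustrative, not the paper's exact algorithm; in particular, the dynamic per-sample weighting and the client-to-client exchange of meta knowledge as conditional initialization are omitted for brevity.

```python
# A hedged sketch, not the paper's implementation: each client distills its
# private data into a few synthetic "meta knowledge" examples, only those
# synthetic examples are uploaded, and the server trains the global model on
# the merged synthetic set.
import torch
import torch.nn.functional as F


def condense_local_data(model, loader, n_syn=10, steps=50, lr=0.1, device="cpu"):
    """Distill one client's private data into n_syn synthetic examples by
    matching the model gradient induced by synthetic data to the gradient
    induced by real data (an assumed dataset-condensation objective).
    Only (x_syn, y_syn) ever leaves the client."""
    x_real, y_real = next(iter(loader))            # one real batch, for brevity
    x_real, y_real = x_real.to(device), y_real.to(device)
    x_syn = torch.randn(n_syn, *x_real.shape[1:], device=device, requires_grad=True)
    y_syn = y_real[:n_syn].clone()                 # reuse real labels (assumes batch >= n_syn)
    params = [p for p in model.parameters() if p.requires_grad]
    g_real = torch.autograd.grad(F.cross_entropy(model(x_real), y_real), params)
    for _ in range(steps):
        g_syn = torch.autograd.grad(F.cross_entropy(model(x_syn), y_syn),
                                    params, create_graph=True)
        # gradient-matching loss; the paper's dynamic sample weighting is omitted
        match = sum(((gs - gr) ** 2).sum() for gs, gr in zip(g_syn, g_real))
        grad_x = torch.autograd.grad(match, x_syn)[0]
        with torch.no_grad():
            x_syn -= lr * grad_x                   # plain SGD step on the synthetic images
    return x_syn.detach(), y_syn


def server_round(global_model, client_loaders, epochs=5, lr=0.01, device="cpu"):
    """One communication round: gather meta knowledge from every active
    client, then update the global model on the merged synthetic set."""
    xs, ys = [], []
    for loader in client_loaders:                  # in practice this runs client-side
        x, y = condense_local_data(global_model, loader, device=device)
        xs.append(x)
        ys.append(y)
    x_all, y_all = torch.cat(xs), torch.cat(ys)    # merged meta knowledge
    opt = torch.optim.SGD(global_model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        F.cross_entropy(global_model(x_all), y_all).backward()
        opt.step()
    return global_model
```

What the sketch makes explicit is the source of the communication savings: only the small condensed set crosses the network each round, rather than raw data or, as in FedAvg-style pipelines, full model parameters; e.g., driving server_round for 10 rounds over per-client MNIST shards matches the restricted budget quoted in the abstract.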
