Paper Title
Salvaging Federated Learning by Local Adaptation
Paper Authors
Paper Abstract
Federated learning (FL) is a heavily promoted approach for training ML models on sensitive data, e.g., text typed by users on their smartphones. FL is expressly designed for training on data that are unbalanced and non-IID across the participants. To ensure privacy and integrity of the federated model, the latest FL approaches use differential privacy or robust aggregation. We look at FL from the \emph{local} viewpoint of an individual participant and ask: (1) do participants have an incentive to participate in FL? (2) how can participants \emph{individually} improve the quality of their local models, without re-designing the FL framework and/or involving other participants? First, we show that on standard tasks such as next-word prediction, many participants gain no benefit from FL because the federated model is less accurate on their data than the models they can train locally on their own. Second, we show that differential privacy and robust aggregation make this problem worse by further destroying the accuracy of the federated model for many participants. Then, we evaluate three techniques for local adaptation of federated models: fine-tuning, multi-task learning, and knowledge distillation. We analyze where each is applicable and demonstrate that all participants benefit from local adaptation. Participants whose local models are poor obtain large accuracy improvements over conventional FL. Participants whose local models are better than the federated model\textemdash and who therefore have no incentive to participate in FL today\textemdash improve less, but sufficiently to make the adapted federated model better than their local models.
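For illustration, below is a minimal PyTorch sketch of the simplest adaptation technique named in the abstract: fine-tuning the federated model on a single participant's local data. The function name fine_tune_locally and the arguments model, local_loader, epochs, and lr are hypothetical placeholders under assumed conventions, not taken from the paper's implementation.

# A minimal sketch of local adaptation by fine-tuning: continue training a
# model initialized with the federated weights on one participant's own data.
# All names here are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn

def fine_tune_locally(model: nn.Module, local_loader, epochs: int = 3, lr: float = 1e-3):
    criterion = nn.CrossEntropyLoss()            # e.g., a next-word prediction loss
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for inputs, targets in local_loader:     # the participant's local examples
            optimizer.zero_grad()
            loss = criterion(model(inputs), targets)
            loss.backward()
            optimizer.step()
    return model                                 # the locally adapted model

Multi-task learning and knowledge distillation follow the same pattern but change the objective: the former trains local and federated "heads" jointly, while the latter uses the federated model's outputs as soft targets for the local model.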