Paper Title

FewFedWeight: Few-shot Federated Learning Framework across Multiple NLP Tasks

Authors

Weilong Dong, Xinwei Wu, Junzhuo Li, Shuangzhi Wu, Chao Bian, Deyi Xiong

Abstract

Massively multi-task learning with large language models has recently made substantial progress on few-shot generalization. However, this is usually performed in a centralized learning fashion, ignoring the privacy sensitivity issue of (annotated) data used in multiple tasks. To mitigate this issue, we propose FewFedWeight, a few-shot federated learning framework across multiple tasks, to achieve the best of both worlds: privacy preservation and cross-task generalization. FewFedWeight trains client models in isolated devices without sharing data. It broadcasts the global model in the server to each client and produces pseudo data for clients so that knowledge from the global model can be explored to enhance few-shot learning of each client model. An energy-based algorithm is further proposed to weight pseudo samples in order to reduce the negative impact of noise from the generated pseudo data. Adaptive model weights of client models are also tuned according to their performance. We use these model weights to dynamically aggregate client models to update the global model. Experiments on 118 NLP tasks show that FewFedWeight can significantly improve the performance of client models on 61% tasks with an average performance improvement rate of 30.5% over the baseline and substantially outperform FedAvg and other decentralized learning methods.
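The abstract describes a server that aggregates client models with dynamic, performance-based weights before broadcasting the updated global model. As a rough illustration only (the exact weighting rule is not given in the abstract, and the function and variable names below are hypothetical), a minimal sketch of such a weighted aggregation step might look like this:

```python
import torch

def aggregate_client_models(client_state_dicts, client_weights):
    """Sketch of a performance-weighted parameter average (FedAvg-style).

    client_state_dicts: list of model state dicts, one per client.
    client_weights: list of non-negative scores reflecting each client
        model's performance (hypothetical; the paper may define these
        weights differently).
    """
    # Normalize the weights so they sum to 1.
    total = sum(client_weights)
    norm_weights = [w / total for w in client_weights]

    # Weighted sum of every parameter tensor across clients.
    global_state = {}
    for key in client_state_dicts[0]:
        global_state[key] = sum(
            w * sd[key].float()
            for w, sd in zip(norm_weights, client_state_dicts)
        )
    return global_state
```

In a full round, each client would fine-tune its local copy on its own (and pseudo) data without sharing it, the server would aggregate the client models with weights tied to their performance, and the resulting global model would be broadcast back to the clients, per the description in the abstract.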
