Paper Title

AdapterDrop: On the Efficiency of Adapters in Transformers

Paper Authors

Andreas Rücklé, Gregor Geigle, Max Glockner, Tilman Beck, Jonas Pfeiffer, Nils Reimers, Iryna Gurevych

Paper Abstract

Massively pre-trained transformer models are computationally expensive to fine-tune, slow for inference, and have large storage requirements. Recent approaches tackle these shortcomings by training smaller models, dynamically reducing the model size, and by training light-weight adapters. In this paper, we propose AdapterDrop, removing adapters from lower transformer layers during training and inference, which incorporates concepts from all three directions. We show that AdapterDrop can dynamically reduce the computational overhead when performing inference over multiple tasks simultaneously, with minimal decrease in task performances. We further prune adapters from AdapterFusion, which improves the inference efficiency while maintaining the task performances entirely.
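
To make the core idea concrete, below is a minimal PyTorch-style sketch of dropping adapters from the lowest transformer layers: a frozen encoder stack where bottleneck adapters are only inserted from layer `n_drop` upward, so the lower layers run adapter-free. The class names (`BottleneckAdapter`, `AdapterDropEncoder`), the single adapter per layer, and all hyperparameters are illustrative assumptions, not the authors' implementation or any library API.

```python
# Illustrative sketch of the AdapterDrop idea (assumptions, not the paper's code):
# bottleneck adapters exist only for layers >= n_drop; the lowest n_drop layers
# run as plain frozen transformer layers, saving adapter compute there.

import torch
import torch.nn as nn


class BottleneckAdapter(nn.Module):
    """Down-project -> nonlinearity -> up-project, with a residual connection."""

    def __init__(self, hidden_size: int, bottleneck_size: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)
        self.up = nn.Linear(bottleneck_size, hidden_size)
        self.act = nn.ReLU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return hidden_states + self.up(self.act(self.down(hidden_states)))


class AdapterDropEncoder(nn.Module):
    """Frozen transformer stack; adapters are dropped from the lowest n_drop layers."""

    def __init__(self, num_layers: int = 12, hidden_size: int = 768, n_drop: int = 5):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.TransformerEncoderLayer(d_model=hidden_size, nhead=12, batch_first=True)
             for _ in range(num_layers)]
        )
        # Adapters only for the upper layers; lower layers stay adapter-free.
        self.adapters = nn.ModuleDict(
            {str(i): BottleneckAdapter(hidden_size) for i in range(n_drop, num_layers)}
        )
        # Only the adapters are trained; the pre-trained layers stay frozen.
        for p in self.layers.parameters():
            p.requires_grad = False

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for i, layer in enumerate(self.layers):
            x = layer(x)
            if str(i) in self.adapters:
                x = self.adapters[str(i)](x)
        return x


if __name__ == "__main__":
    model = AdapterDropEncoder(num_layers=12, n_drop=5)
    out = model(torch.randn(2, 16, 768))  # (batch, seq_len, hidden)
    print(out.shape)  # torch.Size([2, 16, 768])
```

Because the lowest layers are adapter-free and frozen, their activations are identical across tasks, which is what allows them to be computed once and shared when serving several adapter-equipped tasks in parallel.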
