Paper Title
Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation
Paper Authors
Paper Abstract
Fine-tuning pre-trained language models such as BERT has become an effective approach in NLP, yielding state-of-the-art results on many downstream tasks. Recent studies on adapting BERT to new tasks mainly focus on modifying the model structure, redesigning the pre-training tasks, and leveraging external data and knowledge; the fine-tuning strategy itself has yet to be fully explored. In this paper, we improve the fine-tuning of BERT with two effective mechanisms: self-ensemble and self-distillation. Experiments on text classification and natural language inference tasks show that the proposed methods can significantly improve the adaptation of BERT without any external data or knowledge.
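The abstract does not spell out how the two mechanisms are implemented, so the following is a minimal sketch under stated assumptions: the self-ensemble "teacher" is taken to be a running average of the student's own recent parameters, and self-distillation is taken to be an extra MSE penalty that pulls the student's logits toward the teacher's. The backbone name, the loss weight `lambda_sd`, and the averaging coefficient `ema_decay` are hypothetical illustration choices, not values from the paper.

```python
# Sketch: BERT fine-tuning with a self-ensemble teacher and self-distillation.
# Assumptions (not confirmed by the abstract): the teacher is a parameter
# average of the student's recent states, and the distillation term is an
# MSE between student and teacher logits added to the task loss.
import copy
import torch
import torch.nn.functional as F
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "bert-base-uncased"          # assumed backbone for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
student = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
teacher = copy.deepcopy(student)          # self-ensemble teacher, updated by averaging
teacher.eval()

optimizer = torch.optim.AdamW(student.parameters(), lr=2e-5)
lambda_sd = 1.0                           # hypothetical weight on the distillation term
ema_decay = 0.999                         # hypothetical parameter-averaging coefficient


def update_teacher(student, teacher, decay):
    """Fold the current student weights into the parameter-averaged teacher."""
    with torch.no_grad():
        for t_p, s_p in zip(teacher.parameters(), student.parameters()):
            t_p.mul_(decay).add_(s_p, alpha=1.0 - decay)


def train_step(batch):
    """One fine-tuning step: task cross-entropy plus self-distillation toward the teacher."""
    outputs = student(**batch)
    task_loss = outputs.loss              # standard cross-entropy on the labels
    with torch.no_grad():
        teacher_logits = teacher(input_ids=batch["input_ids"],
                                 attention_mask=batch["attention_mask"]).logits
    sd_loss = F.mse_loss(outputs.logits, teacher_logits)
    loss = task_loss + lambda_sd * sd_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    update_teacher(student, teacher, ema_decay)
    return loss.item()


# Toy usage: a two-example batch for a binary text-classification task.
batch = tokenizer(["a great movie", "a dull movie"], return_tensors="pt", padding=True)
batch["labels"] = torch.tensor([1, 0])
print(train_step(batch))
```

Because the teacher requires no separate training run and no external data, this kind of scheme stays within the "no external data or knowledge" setting described in the abstract; the teacher is simply a smoothed view of the student's own fine-tuning trajectory.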