Paper Title

Improving In-Context Few-Shot Learning via Self-Supervised Training

Paper Authors

Mingda Chen, Jingfei Du, Ramakanth Pasunuru, Todor Mihaylov, Srini Iyer, Veselin Stoyanov, Zornitsa Kozareva

Paper Abstract

Self-supervised pretraining has made few-shot learning possible for many NLP tasks. But the pretraining objectives are not typically adapted specifically for in-context few-shot learning. In this paper, we propose to use self-supervision in an intermediate training stage between pretraining and downstream few-shot usage, with the goal of teaching the model to perform in-context few-shot learning. We propose and evaluate four self-supervised objectives on two benchmarks. We find that the intermediate self-supervision stage produces models that outperform strong baselines. Ablation studies show that several factors affect downstream performance, such as the amount of training data and the diversity of the self-supervised objectives. Human-annotated cross-task supervision and self-supervision are complementary. Qualitative analysis suggests that self-supervised-trained models are better at following task requirements.
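
To make the idea concrete, below is a minimal sketch of how one intermediate-stage training instance might be built from unlabeled text. It assumes a next-sentence-generation-style objective, where several input/output demonstration pairs are concatenated into a single context so the model learns to use in-context examples; the function name, prompt templates, and sampling details here are illustrative assumptions, not the paper's actual implementation.

```python
import random

def make_nsg_instance(sentences, num_demos=3):
    """Build one in-context training instance for a
    next-sentence-generation (NSG) style objective: each
    demonstration pairs a short text snippet ("Input:") with the
    sentence that follows it ("Output:"), and the pairs are
    concatenated so all demonstrations share one context window.
    Names and templates are illustrative, not from the paper."""
    demos = []
    for _ in range(num_demos + 1):  # the final pair is the training target
        # Pick a split point: the preceding sentences form the input,
        # and the sentence at the split is the output to generate.
        i = random.randrange(1, len(sentences))
        context = " ".join(sentences[max(0, i - 3):i])
        demos.append(f"Input: {context} Output: {sentences[i]}")
    # During training, the model would be asked to generate the last
    # "Output:" span conditioned on the preceding demonstrations.
    return "\n\n".join(demos)

if __name__ == "__main__":
    sents = [
        "The sky was clear.",
        "We left at dawn.",
        "The trail climbed steeply.",
        "By noon we reached the ridge.",
        "The view stretched for miles.",
    ]
    print(make_nsg_instance(sents, num_demos=2))
```

Mixing instances like this across several different self-supervised objectives is what the abstract refers to as objective diversity, one of the factors the ablations identify as affecting downstream performance.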
