Paper Title

Improving Few-Shot Performance of Language Models via Nearest Neighbor Calibration

Authors

Feng Nie, Meixi Chen, Zhirui Zhang, Xu Cheng

Abstract

Pre-trained language models (PLMs) have exhibited remarkable few-shot learning capabilities when provided a few examples in a natural language prompt as demonstrations of test instances, i.e., in-context learning. However, the performance of in-context learning is susceptible to the choice of prompt format, training examples and the ordering of the training examples. In this paper, we propose a novel nearest-neighbor calibration framework for in-context learning to ease this issue. It is inspired by a phenomenon that the in-context learning paradigm produces incorrect labels when inferring training instances, which provides a useful supervised signal to calibrate predictions. Thus, our method directly augments the predictions with a $k$-nearest-neighbor ($k$NN) classifier over a datastore of cached few-shot instance representations obtained by PLMs and their corresponding labels. Then adaptive neighbor selection and feature regularization modules are introduced to make full use of a few support instances to reduce the $k$NN retrieval noise. Experiments on various few-shot text classification tasks demonstrate that our method significantly improves in-context learning, while even achieving comparable performance with state-of-the-art tuning-based approaches in some sentiment analysis tasks.
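The abstract describes augmenting a PLM's in-context prediction with a $k$NN classifier over a datastore of cached few-shot instance representations and their labels. Below is a minimal sketch of that interpolation idea, assuming pre-computed representation vectors; the function name `knn_calibrate`, the interpolation weight `lam`, and the temperature `tau` are illustrative assumptions, and the paper's adaptive neighbor selection and feature regularization modules are not modeled here.

```python
# Minimal sketch of nearest-neighbor calibration for in-context learning.
# Assumptions (not from the paper): representations are pre-computed vectors,
# `plm_probs` is the PLM's in-context label distribution for the test instance,
# and `lam` / `tau` are hypothetical hyperparameters.
import numpy as np

def knn_calibrate(test_repr, datastore_reprs, datastore_labels,
                  plm_probs, num_labels, k=4, lam=0.5, tau=1.0):
    # Squared Euclidean distance from the test representation to each cached
    # few-shot instance representation in the datastore.
    dists = np.sum((datastore_reprs - test_repr) ** 2, axis=1)

    # Keep the k nearest neighbors.
    nn_idx = np.argsort(dists)[:k]

    # Turn negative distances into a softmax distribution over the neighbors.
    logits = -dists[nn_idx] / tau
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()

    # Aggregate neighbor weights by their labels to form a kNN distribution.
    knn_probs = np.zeros(num_labels)
    for w, idx in zip(weights, nn_idx):
        knn_probs[datastore_labels[idx]] += w

    # Interpolate the PLM prediction with the kNN distribution.
    return lam * knn_probs + (1.0 - lam) * plm_probs

# Toy usage: 4 cached few-shot instances with 3-dim representations, 2 labels.
reprs = np.array([[0.1, 0.2, 0.3], [0.9, 0.8, 0.7],
                  [0.2, 0.1, 0.4], [0.8, 0.9, 0.6]])
labels = np.array([0, 1, 0, 1])
test = np.array([0.15, 0.18, 0.35])
plm = np.array([0.45, 0.55])  # PLM's (possibly miscalibrated) prediction
print(knn_calibrate(test, reprs, labels, plm, num_labels=2, k=2))
```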
