ELECTRA is a Zero-Shot Learner, Too

Paper title

ELECTRA is a Zero-Shot Learner, Too

Authors

Shiwen Ni, Hung-Yu Kao

Abstract

Recently, for few-shot and even zero-shot learning, the new paradigm "pre-train, prompt, and predict" has achieved remarkable results compared with the "pre-train, fine-tune" paradigm. After the success of prompt-based GPT-3, a series of prompt learning methods based on masked language models (MLMs) such as BERT and RoBERTa became popular and widely used. However, another efficient pre-trained discriminative model, ELECTRA, has probably been neglected. In this paper, we attempt to accomplish several NLP tasks in the zero-shot scenario using our proposed prompt learning method based on replaced token detection (RTD). Experimental results show that the ELECTRA model with RTD-prompt learning achieves surprisingly strong, state-of-the-art zero-shot performance. Numerically, compared to MLM-RoBERTa-large and MLM-BERT-large, our RTD-ELECTRA-large improves by about 8.4% and 13.7% on average, respectively, across all 15 tasks. On the SST-2 task in particular, our RTD-ELECTRA-large achieves an astonishing 90.1% accuracy without any training data. Overall, compared to pre-trained masked language models, the pre-trained replaced token detection model performs better in zero-shot learning. The source code is available at: https://github.com/nishiwen1214/RTD-ELECTRA.
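
The abstract does not spell out how an RTD-based prompt is turned into a prediction, so the following is only a minimal sketch of one plausible reading: each candidate label word is written into a prompt template, a discriminator scores how likely that word is to be a "replaced" token, and the label whose word looks least replaced is chosen. The names `rtd_zero_shot`, `toy_replaced_prob`, the template, and the label words are all hypothetical and not taken from the paper or its repository; the toy scorer is a keyword stub standing in for a real ELECTRA discriminator.

```python
from typing import Callable, Dict

# Hypothetical cue words used only by the toy scorer below.
POSITIVE_CUES = ("loved", "wonderful", "enjoyed")
NEGATIVE_CUES = ("boring", "awful", "hated")


def rtd_zero_shot(
    text: str,
    template: str,
    label_words: Dict[str, str],
    replaced_prob: Callable[[str, str], float],
) -> str:
    """Pick the label whose label word looks least 'replaced' in the prompt."""
    scores = {}
    for label, word in label_words.items():
        # Write the candidate label word into the prompt.
        prompt = template.format(text=text, word=word)
        # Assumed interface: probability that `word` is a replaced token.
        scores[label] = replaced_prob(prompt, word)
    # Lowest replaced-probability means the word fits the context best.
    return min(scores, key=scores.get)


def toy_replaced_prob(prompt: str, word: str) -> float:
    """Stub scorer (NOT a real model): a label word counts as 'original'
    when the prompt contains a cue word of matching sentiment."""
    cues = POSITIVE_CUES if word == "great" else NEGATIVE_CUES
    fits = any(cue in prompt.lower() for cue in cues)
    return 0.1 if fits else 0.9


if __name__ == "__main__":
    labels = {"positive": "great", "negative": "terrible"}
    template = "{text} It was {word}."
    print(rtd_zero_shot("I loved this film.", template, labels, toy_replaced_prob))
    print(rtd_zero_shot("The plot was boring.", template, labels, toy_replaced_prob))
```

In practice the stub would be swapped for a pre-trained discriminator's per-token output at the label word's position; no training data is involved at any point, which is what makes the setup zero-shot.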
