Paper Title
Grad2Task: Improved Few-shot Text Classification Using Gradients for Task Representation
Paper Authors
Paper Abstract
Large pretrained language models (LMs) like BERT have improved performance on many disparate natural language processing (NLP) tasks. However, fine-tuning such models requires a large number of training examples for each target task. At the same time, many realistic NLP problems are "few-shot", without a sufficiently large training set. In this work, we propose a novel approach based on conditional neural processes for few-shot text classification that learns to transfer from other, diverse tasks with rich annotation. Our key idea is to represent each task using gradient information from a base model and to train an adaptation network that modulates a text classifier conditioned on the task representation. While previous task-aware few-shot learners represent tasks by input encoding, our novel task representation is more powerful, since gradients capture the input-output relationships of a task. Experimental results show that our approach outperforms traditional fine-tuning, sequential transfer learning, and state-of-the-art meta-learning approaches on a collection of diverse few-shot tasks. We further conduct analyses and ablations to justify our design choices.
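To make the abstract's key idea concrete, below is a minimal, hypothetical PyTorch sketch of one way to build a gradient-based task representation and feed it into an adaptation network. The `ToyEncoder` (a stand-in for a pretrained base model such as BERT), `AdaptationNet`, and the FiLM-style scale/shift modulation are illustrative assumptions, not the paper's actual architecture.

```python
# Hypothetical sketch (not the authors' released code): gradient-based task
# representation for few-shot text classification, in the spirit of Grad2Task.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyEncoder(nn.Module):
    """Stand-in for a pretrained base model such as BERT (assumption)."""
    def __init__(self, vocab_size=1000, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, token_ids):
        # Mean-pool token embeddings into a sentence representation.
        return self.proj(self.emb(token_ids).mean(dim=1))

class AdaptationNet(nn.Module):
    """Maps a gradient-based task embedding to scale/shift parameters
    that modulate the classifier's input features (an assumption)."""
    def __init__(self, task_dim, hidden_dim):
        super().__init__()
        self.to_scale = nn.Linear(task_dim, hidden_dim)
        self.to_shift = nn.Linear(task_dim, hidden_dim)

    def forward(self, task_repr):
        return self.to_scale(task_repr), self.to_shift(task_repr)

def task_representation(encoder, classifier, support_x, support_y):
    """Represent a task by the gradient of the support-set loss w.r.t.
    the classifier parameters (one possible reading of the abstract)."""
    logits = classifier(encoder(support_x))
    loss = F.cross_entropy(logits, support_y)
    grads = torch.autograd.grad(loss, classifier.parameters(), create_graph=True)
    # Flatten and concatenate all gradients into a single task vector.
    return torch.cat([g.reshape(-1) for g in grads])

# --- toy usage on random data ---
dim, n_classes = 64, 3
encoder = ToyEncoder(dim=dim)
classifier = nn.Linear(dim, n_classes)
task_dim = sum(p.numel() for p in classifier.parameters())
adapt = AdaptationNet(task_dim, dim)

support_x = torch.randint(0, 1000, (8, 16))   # 8 support sentences, 16 tokens each
support_y = torch.randint(0, n_classes, (8,))
query_x = torch.randint(0, 1000, (4, 16))

task_repr = task_representation(encoder, classifier, support_x, support_y)
scale, shift = adapt(task_repr)
# Modulate query features with the task-conditioned scale and shift, then
# classify; meta-training would backpropagate through this whole pipeline.
query_feat = encoder(query_x) * (1 + scale) + shift
query_logits = classifier(query_feat)
print(query_logits.shape)  # torch.Size([4, 3])
```

The sketch only illustrates the conditioning pathway described in the abstract: gradients from the support set act as the task embedding, and the adaptation network turns that embedding into modulation parameters applied before classification.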