Paper Title
GiBERT: Introducing Linguistic Knowledge into BERT through a Lightweight Gated Injection Method
Paper Authors
Paper Abstract
Large pre-trained language models such as BERT have been the driving force behind recent improvements across many NLP tasks. However, BERT is only trained to predict missing words - either behind masks or in the next sentence - and has no knowledge of lexical, syntactic or semantic information beyond what it picks up through unsupervised pre-training. We propose a novel method to explicitly inject linguistic knowledge in the form of word embeddings into any layer of a pre-trained BERT. Our performance improvements on multiple semantic similarity datasets when injecting dependency-based and counter-fitted embeddings indicate that such information is beneficial and currently missing from the original model. Our qualitative analysis shows that counter-fitted embedding injection particularly helps with cases involving synonym pairs.
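To make the gated injection idea concrete, the following is a minimal PyTorch sketch. It assumes the method adds a gated, linearly projected external word embedding (e.g. dependency-based or counter-fitted vectors) to the hidden states of one chosen BERT layer; the specific projection, non-linearity, gate parameterization, and zero initialization shown here are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class GatedInjection(nn.Module):
    """Sketch of a lightweight gated injection layer: external word
    embeddings are projected into BERT's hidden space and added to the
    hidden states of a chosen layer, scaled by a learnable gate."""

    def __init__(self, ext_dim: int, hidden_dim: int = 768):
        super().__init__()
        # Project external embeddings into BERT's hidden dimension.
        self.proj = nn.Linear(ext_dim, hidden_dim)
        # Learnable per-dimension gate, initialized at zero so training
        # starts from the unmodified pre-trained representations
        # (an assumed but common choice for this kind of injection).
        self.gate = nn.Parameter(torch.zeros(hidden_dim))

    def forward(self, hidden_states: torch.Tensor,
                ext_embeddings: torch.Tensor) -> torch.Tensor:
        # hidden_states:  (batch, seq_len, hidden_dim) from a BERT layer
        # ext_embeddings: (batch, seq_len, ext_dim), aligned to BERT tokens
        injected = torch.tanh(self.proj(ext_embeddings))
        return hidden_states + self.gate * injected


# Example: inject 300-d counter-fitted embeddings into 768-d hidden states.
layer = GatedInjection(ext_dim=300, hidden_dim=768)
hidden = torch.randn(2, 16, 768)    # dummy BERT hidden states
external = torch.randn(2, 16, 300)  # dummy token-aligned external embeddings
out = layer(hidden, external)
print(out.shape)  # torch.Size([2, 16, 768])
```

Because the output keeps the shape of the original hidden states, such a module could in principle be slotted between any two layers of a pre-trained BERT, which matches the paper's claim that the injection can target any layer.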