Paper Title

MicroBERT: Effective Training of Low-resource Monolingual BERTs through Parameter Reduction and Multitask Learning

Authors

Luke Gessler, Amir Zeldes

Abstract

Transformer language models (TLMs) are critical for most NLP tasks, but they are difficult to create for low-resource languages because of how much pretraining data they require. In this work, we investigate two techniques for training monolingual TLMs in a low-resource setting: greatly reducing TLM size, and complementing the masked language modeling objective with two linguistically rich supervised tasks (part-of-speech tagging and dependency parsing). Results from 7 diverse languages indicate that our model, MicroBERT, is able to produce marked improvements in downstream task evaluations relative to a typical monolingual TLM pretraining approach. Specifically, we find that monolingual MicroBERT models achieve gains of up to 18% for parser LAS and 11% for NER F1 compared to a multilingual baseline, mBERT, while having less than 1% of its parameter count. We conclude that reducing TLM parameter count and using labeled data for pretraining low-resource TLMs can yield large quality benefits and in some cases produce models that outperform multilingual approaches.
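To make the multitask pretraining idea concrete, here is a minimal PyTorch sketch, not the authors' released code: a small shared Transformer encoder trained with a masked language modeling loss plus two supervised auxiliary losses, POS tagging and a simplified unlabeled dependency-head-selection objective standing in for full dependency parsing. All sizes, head designs, the dummy data, and the unweighted summed loss are illustrative assumptions rather than the paper's actual configuration.

```python
import torch
import torch.nn as nn

# Toy sizes; the real model's vocabulary, tag set, and dimensions would differ.
VOCAB, N_POS, MAX_LEN, D, BATCH = 2000, 17, 32, 64, 8

class TinyMultitaskEncoder(nn.Module):
    """A very small shared Transformer encoder with three task heads (illustrative)."""
    def __init__(self):
        super().__init__()
        self.tok_embed = nn.Embedding(VOCAB, D)
        self.pos_embed = nn.Embedding(MAX_LEN, D)  # absolute position embeddings
        layer = nn.TransformerEncoderLayer(
            d_model=D, nhead=4, dim_feedforward=128, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)  # "micro" depth
        self.mlm_head = nn.Linear(D, VOCAB)   # masked language modeling
        self.pos_head = nn.Linear(D, N_POS)   # part-of-speech tagging
        self.arc_head = nn.Linear(D, D)       # scores candidate dependency heads

    def forward(self, ids):
        positions = torch.arange(ids.size(1), device=ids.device)
        h = self.encoder(self.tok_embed(ids) + self.pos_embed(positions))
        # arc_scores[b, i, j] = score that token j is the syntactic head of token i
        arc_scores = self.arc_head(h) @ h.transpose(1, 2)
        return self.mlm_head(h), self.pos_head(h), arc_scores

# Dummy supervision; in practice POS tags and head indices come from a treebank,
# and masked input positions would also be replaced by a [MASK] token id.
ids = torch.randint(0, VOCAB, (BATCH, MAX_LEN))
mask = torch.rand(BATCH, MAX_LEN) < 0.15          # ~15% of positions are "masked"
mlm_tgt = torch.full_like(ids, -100)              # -100 = ignored by CrossEntropyLoss
mlm_tgt[mask] = ids[mask]
pos_tgt = torch.randint(0, N_POS, (BATCH, MAX_LEN))
head_tgt = torch.randint(0, MAX_LEN, (BATCH, MAX_LEN))

model = TinyMultitaskEncoder()
mlm_logits, pos_logits, arc_scores = model(ids)
ce = nn.CrossEntropyLoss(ignore_index=-100)
loss = (ce(mlm_logits.reshape(-1, VOCAB), mlm_tgt.reshape(-1))
        + ce(pos_logits.reshape(-1, N_POS), pos_tgt.reshape(-1))
        + ce(arc_scores.reshape(-1, MAX_LEN), head_tgt.reshape(-1)))
loss.backward()  # one multitask pretraining step; an optimizer update would follow
```

The point the sketch illustrates is that all three losses backpropagate into the same small encoder, so the treebank's labeled signal supplements the limited unlabeled text available for masked language modeling.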
