Paper Title
Advancing Multilingual Pre-training: TRIP Triangular Document-level Pre-training for Multilingual Language Models
Authors
Abstract
Despite the success of multilingual sequence-to-sequence pre-training, most existing approaches rely on document-level monolingual corpora in many different languages, sentence-level bilingual corpora, and sometimes synthetic document-level bilingual corpora. This hampers performance on cross-lingual document-level tasks such as document-level translation. Therefore, we propose to mine and leverage document-level trilingual parallel corpora to improve sequence-to-sequence multilingual pre-training. We present Triangular Document-level Pre-training (TRIP), which is the first in the field to accelerate the conventional monolingual and bilingual objectives into a trilingual objective with a novel method called Grafting. Experiments show that TRIP achieves several strong state-of-the-art (SOTA) scores on three multilingual document-level machine translation benchmarks and one cross-lingual abstractive summarization benchmark, including consistent improvements of up to 3.11 d-BLEU points and 8.9 ROUGE-L points.
Note: In this paper, we use 'bilingual corpora' to denote parallel corpora with 'bilingual translation pairs' in many different language pairs, each consisting of two sentences/documents with the same meaning written in different languages. We use 'trilingual corpora' to denote parallel corpora with 'trilingual translation pairs' in many different language combinations, each consisting of three sentences/documents.