Paper Title
Unifying the Convergences in Multilingual Neural Machine Translation
Paper Authors
Paper Abstract
Although all-in-one-model multilingual neural machine translation (MNMT) has achieved remarkable progress, the convergence inconsistency in joint training is often ignored, i.e., different language pairs reach convergence in different epochs. This leads the trained MNMT model to over-fit low-resource language translations while under-fitting high-resource ones. In this paper, we propose a novel training strategy named LSSD (Language-Specific Self-Distillation), which alleviates the convergence inconsistency and helps MNMT models achieve the best performance on each language pair simultaneously. Specifically, LSSD picks the language-specific best checkpoint for each language pair to teach the current model on the fly. Furthermore, we systematically explore three sample-level manipulations for knowledge transfer. Experimental results on three datasets show that LSSD yields consistent improvements across all language pairs and achieves state-of-the-art performance.
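To make the described mechanism concrete, here is a minimal sketch of language-specific self-distillation in PyTorch: one frozen "best checkpoint" teacher is kept per language pair and distilled into the current model on the fly. This is only an illustration under stated assumptions, not the authors' implementation; the encoder-decoder `model` returning per-token logits, the helpers `make_batches` and `evaluate_bleu`, and the distillation weight `alpha` are hypothetical stand-ins.

```python
# Sketch of language-specific self-distillation (LSSD), as summarized in the
# abstract. Assumptions: `model(src, tgt)` returns logits of shape
# (batch, length, vocab); `make_batches` and `evaluate_bleu` are hypothetical
# helpers for iterating training data and scoring a dev set per language pair.
import copy
import torch
import torch.nn.functional as F


def lssd_train(model, optimizer, train_data, dev_data, lang_pairs,
               num_epochs=10, alpha=0.5):
    """Joint multilingual training with one self-distillation teacher
    (the best checkpoint seen so far) per language pair."""
    best_bleu = {lp: float("-inf") for lp in lang_pairs}
    teachers = {lp: None for lp in lang_pairs}  # language-specific best checkpoints

    for epoch in range(num_epochs):
        model.train()
        for lang_pair, src, tgt in make_batches(train_data, lang_pairs):
            logits = model(src, tgt)
            ce_loss = F.cross_entropy(
                logits.view(-1, logits.size(-1)), tgt.view(-1))

            loss = ce_loss
            teacher = teachers[lang_pair]
            if teacher is not None:
                # Distill from the frozen language-specific best checkpoint.
                with torch.no_grad():
                    teacher_logits = teacher(src, tgt)
                kd_loss = F.kl_div(
                    F.log_softmax(logits, dim=-1),
                    F.softmax(teacher_logits, dim=-1),
                    reduction="batchmean")
                loss = (1 - alpha) * ce_loss + alpha * kd_loss

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        # After each epoch, refresh the per-pair teachers on the fly:
        # whichever pair improved on its dev set gets a new teacher snapshot.
        model.eval()
        for lp in lang_pairs:
            bleu = evaluate_bleu(model, dev_data[lp])
            if bleu > best_bleu[lp]:
                best_bleu[lp] = bleu
                teachers[lp] = copy.deepcopy(model).eval()
```

The key design point the abstract emphasizes is that the teacher is not a single global checkpoint but a separate best checkpoint per language pair, so pairs that converge early keep distilling their peak knowledge into later epochs instead of being over-fitted.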