Paper Title
Cross-lingual Supervision Improves Unsupervised Neural Machine Translation
Paper Authors
Paper Abstract
Neural machine translation~(NMT) is ineffective for zero-resource languages. Recent work on unsupervised neural machine translation (UNMT), which relies only on monolingual data, has achieved promising results. However, a large gap remains between UNMT and NMT trained with parallel supervision. In this work, we introduce a multilingual unsupervised NMT (\method) framework that leverages weakly supervised signals from high-resource language pairs for zero-resource translation directions. More specifically, for an unsupervised language pair such as \texttt{En-De}, we can make full use of the information in a parallel \texttt{En-Fr} dataset to jointly train the unsupervised translation directions, all in one model. \method is based on a multilingual model and requires no changes to standard unsupervised NMT. Empirical results demonstrate that \method significantly improves translation quality, by more than 3 BLEU points, on six benchmark unsupervised translation directions.
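To make the joint-training idea in the abstract concrete, below is a minimal sketch (our illustration, not the paper's code) of one training step that mixes a supervised cross-entropy loss on a high-resource pair (En-Fr) with standard unsupervised objectives (denoising autoencoding and back-translation) for a zero-resource pair (En-De), all in a single shared multilingual model. The model architecture, toy data, corruption scheme, and equal loss weighting are all illustrative assumptions and may differ from the actual \method framework.

\begin{verbatim}
# Minimal sketch of joint supervised + unsupervised training in one multilingual model.
# Everything below (sizes, noise, loss weights) is an illustrative assumption.
import torch
import torch.nn as nn

PAD, BOS, EOS = 0, 1, 2
VOCAB, D_MODEL = 100, 32   # toy shared vocabulary across En, Fr, De

class TinyMultilingualNMT(nn.Module):
    """One shared encoder-decoder for all languages (target language would be
    signalled by a language tag token in a real system)."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL, padding_idx=PAD)
        self.transformer = nn.Transformer(d_model=D_MODEL, nhead=4,
                                          num_encoder_layers=2, num_decoder_layers=2,
                                          dim_feedforward=64, batch_first=True)
        self.out = nn.Linear(D_MODEL, VOCAB)

    def forward(self, src, tgt_in):
        mask = self.transformer.generate_square_subsequent_mask(tgt_in.size(1))
        h = self.transformer(self.embed(src), self.embed(tgt_in), tgt_mask=mask)
        return self.out(h)

def seq2seq_loss(model, src, tgt):
    """Standard teacher-forced cross-entropy on a (src -> tgt) batch."""
    logits = model(src, tgt[:, :-1])
    return nn.functional.cross_entropy(logits.reshape(-1, VOCAB),
                                       tgt[:, 1:].reshape(-1), ignore_index=PAD)

def noise(x):
    """Crude denoising-autoencoder corruption: randomly drop ~10% of tokens."""
    keep = (torch.rand(x.shape) > 0.1) | (x <= EOS)   # never drop special tokens
    return torch.where(keep, x, torch.full_like(x, PAD))

@torch.no_grad()
def greedy_decode(model, src, max_len=10):
    """Greedy autoregressive decoding (simplified: no language tag, no beam search)."""
    tgt = torch.full((src.size(0), 1), BOS, dtype=torch.long)
    for _ in range(max_len):
        next_tok = model(src, tgt)[:, -1].argmax(-1, keepdim=True)
        tgt = torch.cat([tgt, next_tok], dim=1)
    return tgt

model = TinyMultilingualNMT()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Toy batches of token ids standing in for real tokenized corpora.
en_fr_src = torch.randint(3, VOCAB, (4, 10))   # parallel En side
en_fr_tgt = torch.randint(3, VOCAB, (4, 10))   # parallel Fr side
en_mono   = torch.randint(3, VOCAB, (4, 10))   # monolingual En
de_mono   = torch.randint(3, VOCAB, (4, 10))   # monolingual De

for step in range(3):
    # 1) Weakly supervised signal: ordinary NMT loss on the high-resource pair En-Fr.
    loss_sup = seq2seq_loss(model, en_fr_src, en_fr_tgt)

    # 2) Unsupervised signals for the zero-resource pair: denoising autoencoding ...
    loss_dae = (seq2seq_loss(model, noise(en_mono), en_mono) +
                seq2seq_loss(model, noise(de_mono), de_mono))

    # ... and on-the-fly back-translation: translate De with the current model,
    # then train on the synthetic pair in the reverse direction.
    synthetic_en = greedy_decode(model, de_mono)
    loss_bt = seq2seq_loss(model, synthetic_en, de_mono)

    loss = loss_sup + loss_dae + loss_bt   # equal weights, purely for illustration
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(f"step {step}: total loss {loss.item():.3f}")
\end{verbatim}

Because all translation directions share one set of parameters, the supervised gradient from \texttt{En-Fr} shapes the same encoder and decoder that the unsupervised \texttt{En-De} objectives rely on, which is the sense in which the high-resource pair provides a weak supervisory signal to the zero-resource directions.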