Paper Title

Capturing document context inside sentence-level neural machine translation models with self-training

Authors

Elman Mansimov, Gábor Melis, Lei Yu

Abstract

Neural machine translation (NMT) has arguably achieved human-level parity when trained and evaluated at the sentence level. Document-level neural machine translation has received less attention and lags behind its sentence-level counterpart. The majority of the proposed document-level approaches investigate ways of conditioning the model on several source or target sentences to capture document context. These approaches require training a specialized NMT model from scratch on parallel document-level corpora. We propose an approach that does not require training a specialized model on parallel document-level corpora and is applied to a trained sentence-level NMT model at decoding time. We process the document from left to right multiple times and self-train the sentence-level model on pairs of source sentences and generated translations. Our approach reinforces the choices made by the model, thus making it more likely that the same choices will be made in other sentences in the document. We evaluate our approach on three document-level datasets: NIST Chinese-English, WMT'19 Chinese-English, and OpenSubtitles English-Russian. We demonstrate that our approach achieves higher BLEU scores and higher human preference than the baseline. Qualitative analysis of our approach shows that the choices made by the model are consistent across the document.
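The abstract only sketches the decoding-time self-training procedure at a high level. As a concrete illustration, below is a minimal Python sketch of one way such a loop could look. The SentenceLevelNMT interface, its translate and fine_tune methods, and the num_passes parameter are all hypothetical stand-ins for exposition, not the authors' actual implementation.

```python
from typing import List, Protocol, Tuple

class SentenceLevelNMT(Protocol):
    """Hypothetical interface for an already-trained sentence-level NMT model."""
    def translate(self, source: str) -> str: ...
    def fine_tune(self, pairs: List[Tuple[str, str]]) -> None: ...

def self_train_decode(model: SentenceLevelNMT,
                      document: List[str],
                      num_passes: int = 2) -> List[str]:
    """Decode a document with self-training at test time.

    Following the abstract's description: the document is processed
    left to right, and the model is self-trained on each
    (source sentence, generated translation) pair, reinforcing its
    choices for the remaining sentences. The pass over the document
    is repeated num_passes times (an assumed hyperparameter).
    """
    translations: List[str] = []
    for _ in range(num_passes):
        translations = []
        for source in document:  # left-to-right over the document
            hypothesis = model.translate(source)
            translations.append(hypothesis)
            # Self-train on the model's own output so that later
            # sentences are more likely to reuse the same choices
            # (e.g. consistent lexical choice across the document).
            model.fine_tune([(source, hypothesis)])
    return translations
```

One design point worth noting: fine-tuning after every sentence, rather than once per pass, is what lets choices made early in the document influence the translations of later sentences within the same pass.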
