Paper Title

Non-Autoregressive Machine Translation with Latent Alignments

Paper Authors

Chitwan Saharia, William Chan, Saurabh Saxena, Mohammad Norouzi

Paper Abstract

This paper presents two strong methods, CTC and Imputer, for non-autoregressive machine translation that model latent alignments with dynamic programming. We revisit CTC for machine translation and demonstrate that a simple CTC model can achieve state-of-the-art for single-step non-autoregressive machine translation, contrary to what prior work indicates. In addition, we adapt the Imputer model for non-autoregressive machine translation and demonstrate that Imputer with just 4 generation steps can match the performance of an autoregressive Transformer baseline. Our latent alignment models are simpler than many existing non-autoregressive translation baselines; for example, we do not require target length prediction or re-scoring with an autoregressive model. On the competitive WMT'14 En$\rightarrow$De task, our CTC model achieves 25.7 BLEU with a single generation step, while Imputer achieves 27.5 BLEU with 2 generation steps, and 28.0 BLEU with 4 generation steps. This compares favourably to the autoregressive Transformer baseline at 27.8 BLEU.
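To make the "latent alignments with dynamic programming" claim concrete, below is a minimal, hypothetical sketch of a CTC-based non-autoregressive translation loss using PyTorch's built-in `torch.nn.CTCLoss`, which runs the CTC dynamic program internally. This is not the authors' implementation: all tensor names, vocabulary and sequence sizes are illustrative placeholders, and the sketch simply assumes the alignment length (here, the source length) is at least the target length, whereas the paper handles this by upsampling the source.

```python
# Illustrative sketch only (not the paper's code): CTC loss for
# non-autoregressive translation. CTC marginalizes over all monotonic
# alignments of the target string to the decoder positions via
# dynamic programming, which torch.nn.CTCLoss implements.
import torch
import torch.nn as nn

vocab_size = 32000                 # hypothetical target vocabulary; blank = index 0
batch, src_len, tgt_len = 2, 10, 7  # toy sizes; assumes src_len >= tgt_len
logits = torch.randn(batch, src_len, vocab_size)  # stand-in for decoder output

# CTCLoss expects log-probabilities shaped (T, N, C)
log_probs = logits.log_softmax(-1).transpose(0, 1)

targets = torch.randint(1, vocab_size, (batch, tgt_len))  # exclude blank index 0
input_lengths = torch.full((batch,), src_len, dtype=torch.long)
target_lengths = torch.full((batch,), tgt_len, dtype=torch.long)

ctc = nn.CTCLoss(blank=0, zero_infinity=True)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
print(loss.item())
```

At inference, single-step generation in this setup amounts to standard CTC greedy decoding: take the argmax token at each position in parallel, then collapse repeats and remove blanks, with no target length prediction needed.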
