Paper Title

Graph-to-Sequence Neural Machine Translation

Authors

Sufeng Duan, Hai Zhao, Rui Wang

Abstract


Neural machine translation (NMT) usually works in a seq2seq learning way by viewing either the source or target sentence as a linear sequence of words, which can be regarded as a special case of a graph, taking words in the sequence as nodes and relationships between words as edges. Given that current NMT models more or less capture graph information among the sequence in a latent way, we present a graph-to-sequence model that facilitates explicit graph information capturing. In detail, we propose a graph-based, self-attention-network (SAN) NMT model called Graph-Transformer, which captures information of subgraphs of different orders in every layer. Subgraphs are put into different groups according to their orders, and each group of subgraphs reflects a different level of dependency between words. For fusing subgraph representations, we empirically explore three methods that weight the groups of subgraphs of different orders differently. Results of experiments on WMT14 English-German and IWSLT14 German-English show that our method can effectively boost the Transformer, with an improvement of 1.1 BLEU points on the WMT14 English-German dataset and 1.0 BLEU points on the IWSLT14 German-English dataset.
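To make the fusion idea concrete, here is a minimal PyTorch sketch of one plausible fusion scheme: a learned, softmax-normalized weighted sum over subgraph groups of different orders. The class name SubgraphFusion, the tensor shapes, and the weighted-sum formulation are illustrative assumptions, not the paper's actual implementation; the paper explores three fusion methods, and this sketch only shows the general shape of such a scheme.

```python
import torch
import torch.nn as nn

class SubgraphFusion(nn.Module):
    """Hypothetical sketch (not the paper's code): fuse representations
    of subgraph groups of different orders with one learned,
    softmax-normalized weight per group."""

    def __init__(self, num_orders: int):
        super().__init__()
        # one scalar logit per subgraph order (group)
        self.group_logits = nn.Parameter(torch.zeros(num_orders))

    def forward(self, group_reprs: torch.Tensor) -> torch.Tensor:
        # group_reprs: (num_orders, batch, seq_len, d_model),
        # one representation per group of subgraphs of the same order
        weights = torch.softmax(self.group_logits, dim=0)
        # weighted sum over the order dimension
        return torch.einsum('g,gbsd->bsd', weights, group_reprs)

# usage with toy inputs: fuse 3 groups (e.g. 1st-, 2nd-, 3rd-order subgraphs)
fusion = SubgraphFusion(num_orders=3)
reprs = torch.randn(3, 2, 10, 512)   # (orders, batch, seq_len, d_model)
out = fusion(reprs)                  # -> (2, 10, 512)
```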
