Enriched In-Order Linearization for Faster Sequence-to-Sequence Constituent Parsing

Paper title

Enriched In-Order Linearization for Faster Sequence-to-Sequence Constituent Parsing

Authors

Daniel Fernández-González, Carlos Gómez-Rodríguez

Abstract

Sequence-to-sequence constituent parsing requires a linearization to represent trees as sequences. Top-down tree linearizations, which can be based on brackets or shift-reduce actions, have achieved the best accuracy to date. In this paper, we show that these results can be improved by using an in-order linearization instead. Based on this observation, we implement an enriched in-order shift-reduce linearization inspired by Vinyals et al. (2015)'s approach, achieving the best accuracy to date on the English PTB dataset among fully-supervised single-model sequence-to-sequence constituent parsers. Finally, we apply deterministic attention mechanisms to match the speed of state-of-the-art transition-based parsers, thus showing that sequence-to-sequence models can match them, not only in accuracy, but also in speed.
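To make the difference between the two traversal orders concrete, the following is a minimal Python sketch that linearizes a small toy tree into shift-reduce action sequences, once top-down and once in-order. It is an illustration under stated assumptions, not the authors' implementation: the action names (`SHIFT`, `NT-X`, `PJ-X`, `REDUCE`), the tuple-based tree encoding, and the example sentence are all hypothetical choices made here, and the sketch covers only the plain linearizations, not the enriched variant described in the abstract.

```python
from typing import List, Tuple, Union

# Assumed encoding: a tree is either a word (str) or a (label, children) pair.
Tree = Union[str, Tuple[str, list]]


def top_down(tree: Tree) -> List[str]:
    """Pre-order: open the nonterminal, emit its children, then close it."""
    if isinstance(tree, str):
        return ["SHIFT"]
    label, children = tree
    actions = [f"NT-{label}"]
    for child in children:
        actions += top_down(child)
    return actions + ["REDUCE"]


def in_order(tree: Tree) -> List[str]:
    """In-order: emit the first child, then the nonterminal, then the rest."""
    if isinstance(tree, str):
        return ["SHIFT"]
    label, children = tree
    actions = in_order(children[0]) + [f"PJ-{label}"]
    for child in children[1:]:
        actions += in_order(child)
    return actions + ["REDUCE"]


# Toy tree for "The cat sleeps": (S (NP The cat) (VP sleeps))
tree = ("S", [("NP", ["The", "cat"]), ("VP", ["sleeps"])])
print(top_down(tree))
print(in_order(tree))
```

In the top-down sequence every nonterminal is predicted before any of its words has been read, while in the in-order sequence each nonterminal is predicted only after its first child is complete. Both sequences have the same length for a given tree, so the difference lies in the order of the decisions, not in how many there are.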
