Paper Title

Enriched In-Order Linearization for Faster Sequence-to-Sequence Constituent Parsing

Authors

Daniel Fernández-González, Carlos Gómez-Rodríguez

Abstract

Sequence-to-sequence constituent parsing requires a linearization to represent trees as sequences. Top-down tree linearizations, which can be based on brackets or shift-reduce actions, have achieved the best accuracy to date. In this paper, we show that these results can be improved by using an in-order linearization instead. Based on this observation, we implement an enriched in-order shift-reduce linearization inspired by the approach of Vinyals et al. (2015), achieving the best accuracy to date on the English PTB dataset among fully-supervised single-model sequence-to-sequence constituent parsers. Finally, we apply deterministic attention mechanisms to match the speed of state-of-the-art transition-based parsers, thus showing that sequence-to-sequence models can match them, not only in accuracy, but also in speed.
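
To make the contrast between top-down and in-order linearizations concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy tree encoding (a word is a string, a constituent is a `(label, children)` pair) and uses the illustrative action names `NT-X`, `PJ-X`, `SHIFT`, and `REDUCE`. It shows only the plain shift-reduce variants; the enriched variant described in the abstract and any final end-of-parse action are deliberately left out.

```python
from typing import List

# Assumed toy encoding: a leaf is a word (str); a constituent is a
# (label, children) tuple. This encoding is hypothetical, chosen for brevity.

def top_down(tree) -> List[str]:
    """Top-down: open the nonterminal first, then emit all children."""
    if isinstance(tree, str):
        return ["SHIFT"]
    label, children = tree
    actions = [f"NT-{label}"]
    for child in children:
        actions += top_down(child)
    actions.append("REDUCE")
    return actions

def in_order(tree) -> List[str]:
    """In-order: emit the first child, then project the nonterminal,
    then emit the remaining children."""
    if isinstance(tree, str):
        return ["SHIFT"]
    label, children = tree
    actions = in_order(children[0])
    actions.append(f"PJ-{label}")
    for child in children[1:]:
        actions += in_order(child)
    actions.append("REDUCE")
    return actions

example = ("S", [("NP", ["The", "cat"]), ("VP", ["sleeps"])])
print(" ".join(top_down(example)))
print(" ".join(in_order(example)))
```

Both sequences encode the same tree and have the same length; the difference is only where each nonterminal label appears. In the in-order sequence the label is predicted after the first child has been seen, so the decoder has some bottom-up evidence before committing to a constituent type.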
