Paper title
Enriched In-Order Linearization for Faster Sequence-to-Sequence Constituent Parsing
Paper authors
Paper abstract
Sequence-to-sequence constituent parsing requires a linearization to represent trees as sequences. Top-down tree linearizations, which can be based on brackets or shift-reduce actions, have achieved the best accuracy to date. In this paper, we show that these results can be improved by using an in-order linearization instead. Based on this observation, we implement an enriched in-order shift-reduce linearization inspired by the approach of Vinyals et al. (2015), achieving the best accuracy to date on the English PTB dataset among fully-supervised single-model sequence-to-sequence constituent parsers. Finally, we apply deterministic attention mechanisms to match the speed of state-of-the-art transition-based parsers, thus showing that sequence-to-sequence models can match them not only in accuracy but also in speed.
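To make the contrast between the two linearizations concrete, the following is a minimal sketch, not the paper's implementation. It assumes the plain (non-enriched) shift-reduce action sets as they are commonly defined: top-down emits a nonterminal action before all of a constituent's children, while in-order emits a projection action after the first child. The action names (`NT-X`, `PJ-X`, `SHIFT`, `REDUCE`), the nested-tuple tree encoding, and the function names are illustrative assumptions; part-of-speech tags and the enrichment described in the abstract are omitted.

```python
# Illustrative sketch only: plain top-down vs. in-order shift-reduce linearization.
# A tree is a nested tuple (label, child1, child2, ...); a leaf is a word (str).
# Action names are assumptions chosen for this example.

def top_down(tree):
    """Emit the nonterminal first, then all children, then REDUCE."""
    if isinstance(tree, str):
        return ["SHIFT"]
    label, *children = tree
    actions = [f"NT-{label}"]
    for child in children:
        actions += top_down(child)
    actions.append("REDUCE")
    return actions


def in_order(tree):
    """Emit the first child, then project the nonterminal, then the rest, then REDUCE."""
    if isinstance(tree, str):
        return ["SHIFT"]
    label, first, *rest = tree
    actions = in_order(first)
    actions.append(f"PJ-{label}")
    for child in rest:
        actions += in_order(child)
    actions.append("REDUCE")
    return actions


# (S (NP The cat) (VP sleeps)), with part-of-speech tags left out for brevity.
tree = ("S", ("NP", "The", "cat"), ("VP", "sleeps"))

print(top_down(tree))
# ['NT-S', 'NT-NP', 'SHIFT', 'SHIFT', 'REDUCE', 'NT-VP', 'SHIFT', 'REDUCE', 'REDUCE']
print(in_order(tree))
# ['SHIFT', 'PJ-NP', 'SHIFT', 'REDUCE', 'PJ-S', 'SHIFT', 'PJ-VP', 'REDUCE', 'REDUCE']
```

Both sequences have the same length and encode the same tree; the difference is that the in-order variant commits to a constituent label only after its first child has been built, whereas the top-down variant must predict it before seeing any of the constituent's words.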