Paper Title

Parsing as Pretraining

Paper Authors

David Vilares, Michalina Strzyz, Anders Søgaard, Carlos Gómez-Rodríguez

Abstract

Recent analyses suggest that encoders pretrained for language modeling capture certain morpho-syntactic structure. However, probing frameworks for word vectors still do not report results on standard setups such as constituent and dependency parsing. This paper addresses this problem and does full parsing (on English) relying only on pretraining architectures -- and no decoding. We first cast constituent and dependency parsing as sequence tagging. We then use a single feed-forward layer to directly map word vectors to labels that encode a linearized tree. This is used to: (i) see how far we can reach on syntax modeling with just pretrained encoders, and (ii) shed some light on the syntax-sensitivity of different word vectors (by freezing the weights of the pretraining network during training). For evaluation, we use bracketing F1-score and LAS, and analyze in-depth differences across representations for span lengths and dependency displacements. The overall results surpass existing sequence tagging parsers on the PTB (93.5%) and end-to-end EN-EWT UD (78.8%).
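
To make the described setup concrete, below is a minimal sketch, assuming a PyTorch encoder; the names `LinearTaggingHead`, `freeze`, `hidden_size`, and `num_labels` are illustrative assumptions, not the authors' implementation. It shows the single feed-forward layer that maps contextual word vectors to tree-encoding labels, and the weight-freezing step used to probe an encoder's syntax-sensitivity.

```python
# Minimal sketch (not the authors' released code): a frozen pretrained encoder
# yields one vector per word, and a single linear layer maps each vector to a
# label of the linearized tree.
import torch
import torch.nn as nn


class LinearTaggingHead(nn.Module):
    """Maps contextual word vectors to tree-encoding tag logits."""

    def __init__(self, hidden_size: int, num_labels: int):
        super().__init__()
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, word_vectors: torch.Tensor) -> torch.Tensor:
        # word_vectors: (batch, seq_len, hidden_size), produced by a pretrained encoder
        return self.classifier(word_vectors)  # (batch, seq_len, num_labels)


def freeze(encoder: nn.Module) -> None:
    # Freeze the pretrained encoder so only the linear head is trained,
    # mirroring the probing condition mentioned in the abstract.
    # `encoder` stands for any pretrained module loaded elsewhere (an assumption).
    for param in encoder.parameters():
        param.requires_grad = False
```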
