Paper Title
Efficient Second-Order TreeCRF for Neural Dependency Parsing
Paper Authors
Paper Abstract
In the deep learning (DL) era, parsing models have been dramatically simplified with little loss in performance, thanks to the remarkable capability of multi-layer BiLSTMs in context representation. The biaffine parser, the most popular graph-based dependency parser owing to its high efficiency and performance, directly scores single dependencies under the arc-factorization assumption and adopts a very simple local token-wise cross-entropy training loss. This paper presents, for the first time, a second-order TreeCRF extension to the biaffine parser. For a long time, the complexity and inefficiency of the inside-outside algorithm hindered the popularity of TreeCRF. To address this issue, we propose an effective way to batchify the inside and Viterbi algorithms for direct large matrix operations on GPUs, and to avoid the complex outside algorithm via efficient back-propagation. Experiments and analysis on 27 datasets from 13 languages clearly show that techniques developed before the DL era, such as structural learning (the global TreeCRF loss) and high-order modeling, are still useful and can further boost parsing performance over the state-of-the-art biaffine parser, especially for partially annotated training data. We release our code at https://github.com/yzhangcs/crfpar.
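To make the abstract's two key ideas concrete, here is a minimal PyTorch sketch of (a) a batchified inside algorithm, where all spans of the same width are processed with a few tensor operations instead of a Python loop over spans, and (b) recovering arc marginals through back-propagation rather than a hand-written outside pass. This is our own illustration under simplifying assumptions, not the released implementation: it is first-order (the paper's model is second-order), handles a single sentence (the paper additionally batches over sentences), and the function name `inside_log_partition` and the toy scores below are hypothetical.

```python
import torch

def inside_log_partition(s):
    # s: (n+1, n+1) arc scores, s[h, d] = score of head h -> dependent d;
    # index 0 is the artificial root. Returns log Z over projective trees.
    n = s.size(0) - 1
    shape = (n + 1, n + 1)
    C_r = s.new_full(shape, float('-inf'))  # complete span headed at its left end
    C_l = s.new_full(shape, float('-inf'))  # complete span headed at its right end
    I_r = s.new_full(shape, float('-inf'))  # incomplete span with arc i -> j
    I_l = s.new_full(shape, float('-inf'))  # incomplete span with arc j -> i
    diag = torch.arange(n + 1)
    C_r[diag, diag] = 0.
    C_l[diag, diag] = 0.

    # batchified inside: one pass per span width, all spans of that width at once
    for w in range(1, n + 1):
        i = torch.arange(n + 1 - w)           # all span starts on this diagonal
        j = i + w                             # matching span ends
        # incomplete spans: merge C_r[i, r] and C_l[r+1, j] over splits r in [i, j)
        r = i.unsqueeze(1) + torch.arange(w)  # (num_spans, w) split points
        inner = (C_r[i.unsqueeze(1), r] + C_l[r + 1, j.unsqueeze(1)]).logsumexp(-1)
        I_r[i, j] = s[i, j] + inner           # right arc i -> j
        I_l[i, j] = s[j, i] + inner           # left arc j -> i
        # complete spans: attach an incomplete span to a smaller complete one
        r = i.unsqueeze(1) + torch.arange(1, w + 1)   # splits in (i, j]
        C_r[i, j] = (I_r[i.unsqueeze(1), r] + C_r[r, j.unsqueeze(1)]).logsumexp(-1)
        r = i.unsqueeze(1) + torch.arange(w)          # splits in [i, j)
        C_l[i, j] = (C_l[i.unsqueeze(1), r] + I_l[r, j.unsqueeze(1)]).logsumexp(-1)

    return C_r[0, n]  # complete span from the root over the whole sentence
```

A usage sketch with hypothetical scores and a hypothetical gold tree, showing the global TreeCRF loss and the back-propagation trick:

```python
# TreeCRF loss and arc marginals for a toy 3-word sentence
n = 3
torch.manual_seed(0)
s = torch.randn(n + 1, n + 1, requires_grad=True)  # hypothetical arc scores
log_z = inside_log_partition(s)
heads = [2, 0, 2]  # hypothetical gold head of words 1..3 (0 = root)
gold_score = sum(s[h, d] for d, h in enumerate(heads, 1))
loss = log_z - gold_score  # global TreeCRF loss: -log P(gold tree)
# no explicit outside pass: d log Z / d s[h, d] = marginal P(arc h -> d)
marginals, = torch.autograd.grad(log_z, s)
```

The design choice mirrors the abstract: since the gradient of log Z with respect to each arc score equals that arc's marginal probability, a single `backward()` through the inside recursion yields everything the outside algorithm would compute, and the O(n^3) split-point loops collapse into per-width tensor operations that run efficiently on GPUs.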