Paper Title
Automatic Error Analysis for Document-level Information Extraction
Paper Authors
Paper Abstract
Document-level information extraction (IE) tasks have recently begun to be revisited in earnest using the end-to-end neural network techniques that have been successful on their sentence-level IE counterparts. Evaluation of the approaches, however, has been limited in a number of dimensions. In particular, the precision/recall/F1 scores typically reported provide few insights on the range of errors the models make. We build on the work of Kummerfeld and Klein (2013) to propose a transformation-based framework for automating error analysis in document-level event and (N-ary) relation extraction. We employ our framework to compare two state-of-the-art document-level template-filling approaches on datasets from three domains; and then, to gauge progress in IE since its inception 30 years ago, vs. four systems from the MUC-4 (1992) evaluation.
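As a rough illustration of what a transformation-based error analysis might look like in code, the sketch below aligns a set of predicted template role fillers against a gold set and tallies a few error categories (span error, wrong role, spurious filler, missing filler). This is a minimal sketch under simplifying assumptions: the `Filler` class, the matching heuristics, and the category names are hypothetical and are not the framework or taxonomy proposed in the paper.

```python
# Illustrative only: a toy "transformation" counter that records what would
# have to change to turn a predicted template into the gold template.
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Filler:
    role: str   # e.g., "Perpetrator", "Victim" (hypothetical role names)
    span: str   # surface string of the extracted mention


def analyze_errors(predicted: set, gold: set) -> Counter:
    """Count the edits needed to transform `predicted` into `gold`."""
    errors = Counter()
    unmatched_gold = set(gold)
    for p in predicted:
        if p in unmatched_gold:
            # Exact match on role and span: no transformation needed.
            unmatched_gold.discard(p)
            continue
        span_partner = next((g for g in unmatched_gold
                             if g.role == p.role
                             and (p.span in g.span or g.span in p.span)), None)
        role_partner = next((g for g in unmatched_gold
                             if g.span == p.span), None)
        if span_partner is not None:
            errors["span_error"] += 1        # right role, inexact span
            unmatched_gold.discard(span_partner)
        elif role_partner is not None:
            errors["wrong_role"] += 1        # right span, wrong role
            unmatched_gold.discard(role_partner)
        else:
            errors["spurious_filler"] += 1   # predicted filler with no gold counterpart
    errors["missing_filler"] += len(unmatched_gold)  # gold fillers never predicted
    return errors


if __name__ == "__main__":
    pred = {Filler("Perpetrator", "armed men"), Filler("Victim", "the mayor")}
    gold = {Filler("Perpetrator", "heavily armed men"), Filler("Target", "the town hall")}
    print(analyze_errors(pred, gold))
    # Counter({'span_error': 1, 'spurious_filler': 1, 'missing_filler': 1})
```

The point of tallying categories like these, rather than a single precision/recall/F1 score, is that the resulting counts indicate where a system loses credit (boundary mistakes, role confusions, hallucinated or missed fillers), which is the kind of insight the paper argues aggregate scores alone do not provide.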