Paper Title

FLERT: Document-Level Features for Named Entity Recognition

Authors

Stefan Schweter, Alan Akbik

Abstract

Current state-of-the-art approaches for named entity recognition (NER) typically consider text at the sentence-level and thus do not model information that crosses sentence boundaries. However, the use of transformer-based models for NER offers natural options for capturing document-level features. In this paper, we perform a comparative evaluation of document-level features in the two standard NER architectures commonly considered in the literature, namely "fine-tuning" and "feature-based LSTM-CRF". We evaluate different hyperparameters for document-level features such as context window size and enforcing document-locality. We present experiments from which we derive recommendations for how to model document context and present new state-of-the-art scores on several CoNLL-03 benchmark datasets. Our approach is integrated into the Flair framework to facilitate reproduction of our experiments.
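To make the two hyperparameters named in the abstract concrete, the following is a minimal sketch of how a context window and document-locality could be applied when preparing a sentence for tagging. It is an illustration only, not the authors' Flair implementation: the function name `add_context`, the `doc_ids` representation, and the default window of 64 tokens are all assumptions made for this example.

```python
from typing import List, Tuple


def add_context(
    sentences: List[List[str]],
    doc_ids: List[int],
    index: int,
    window: int = 64,              # assumed default, chosen for illustration
    enforce_locality: bool = True,
) -> Tuple[List[str], List[str], List[str]]:
    """Return (left_context, sentence, right_context) for sentences[index].

    Hypothetical helper: `sentences` is a corpus of tokenized sentences and
    `doc_ids[i]` is the id of the document that sentences[i] belongs to.
    """
    target_doc = doc_ids[index]

    # Walk backwards, collecting up to `window` tokens of left context.
    left: List[str] = []
    i = index - 1
    while i >= 0 and len(left) < window:
        # With document-locality enforced, stop at a document boundary.
        if enforce_locality and doc_ids[i] != target_doc:
            break
        need = window - len(left)
        left = sentences[i][-need:] + left
        i -= 1

    # Walk forwards, collecting up to `window` tokens of right context.
    right: List[str] = []
    i = index + 1
    while i < len(sentences) and len(right) < window:
        if enforce_locality and doc_ids[i] != target_doc:
            break
        need = window - len(right)
        right = right + sentences[i][:need]
        i += 1

    return left, sentences[index], right
```

In a setup like this, only the tokens of the middle element would receive NER labels; the left and right context would serve purely as extra input to the transformer. Setting `enforce_locality=False` lets the window spill into neighbouring documents, which is the alternative the abstract's "enforcing document-locality" hyperparameter contrasts against.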
