Paper Title
One Model to Recognize Them All: Marginal Distillation from NER Models with Different Tag Sets
Paper Authors
Paper Abstract
Named entity recognition (NER) is a fundamental component in the modern language understanding pipeline. Public NER resources such as annotated data and model services are available in many domains. However, given a particular downstream application, there is often no single NER resource that supports all the desired entity types, so users must leverage multiple resources with different tag sets. This paper presents a marginal distillation (MARDI) approach for training a unified NER model from resources with disjoint or heterogeneous tag sets. In contrast to recent work, MARDI requires access only to pre-trained models rather than the original training datasets. This flexibility makes it easier to work with sensitive domains such as healthcare and finance. Furthermore, our approach is general enough to integrate with different NER architectures, including local models (e.g., BiLSTM) and global models (e.g., CRF). Experiments on two benchmark datasets show that MARDI performs on par with a strong marginal CRF baseline while being more flexible in the form of required NER resources. MARDI also sets a new state of the art on the progressive NER task, significantly outperforming the previous best model.
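To make the core idea concrete, the following is a minimal sketch (not the paper's implementation) of distilling from a teacher with a smaller tag set into a student with a unified tag set. The assumption illustrated here is that entity types the teacher does not model are collapsed onto its "O" (outside) tag, so the student's per-token marginal distribution can be compared directly against the teacher's. The tag names and the `collapse`/`distill_loss` helpers are illustrative, not from the paper.

```python
import numpy as np

# Hypothetical tag sets: the student covers a unified set, while the
# teacher was trained with only a subset of the entity types.
unified_tags = ["O", "PER", "ORG", "LOC"]
teacher_tags = ["O", "PER"]  # ORG/LOC are unlabeled "O" for this teacher

def collapse(student_probs, unified, teacher):
    """Marginalize the student's per-token tag distribution onto the
    teacher's tag set: probability mass for tags the teacher does not
    model is added to the teacher's "O" column."""
    out = np.zeros((student_probs.shape[0], len(teacher)))
    for j, tag in enumerate(unified):
        k = teacher.index(tag) if tag in teacher else teacher.index("O")
        out[:, k] += student_probs[:, j]
    return out

def distill_loss(student_probs, teacher_probs, unified, teacher):
    """Cross-entropy between the teacher's per-token marginals and the
    student's marginals collapsed onto the teacher's tag set."""
    q = collapse(student_probs, unified, teacher)
    return -np.sum(teacher_probs * np.log(q + 1e-12)) / len(teacher_probs)

# Two tokens: student marginals over the unified set, teacher marginals
# over its own smaller set.
student = np.array([[0.7, 0.1, 0.1, 0.1],
                    [0.2, 0.6, 0.1, 0.1]])
teacher = np.array([[0.9, 0.1],
                    [0.2, 0.8]])
loss = distill_loss(student, teacher, unified_tags, teacher_tags)
```

In the full method, one such loss term would be computed per teacher (each with its own tag set) and summed, so a single student learns to recognize the union of all entity types without ever seeing the teachers' training data.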