Paper Title

Interpretable Entity Representations through Large-Scale Typing

Paper Authors

Yasumasa Onoe, Greg Durrett

Paper Abstract

In standard methodology for natural language processing, entities in text are typically embedded in dense vector spaces with pre-trained models. The embeddings produced this way are effective when fed into downstream models, but they require end-task fine-tuning and are fundamentally difficult to interpret. In this paper, we present an approach to creating entity representations that are human readable and achieve high performance on entity-related tasks out of the box. Our representations are vectors whose values correspond to posterior probabilities over fine-grained entity types, indicating the confidence of a typing model's decision that the entity belongs to the corresponding type. We obtain these representations using a fine-grained entity typing model, trained either on supervised ultra-fine entity typing data (Choi et al., 2018) or distantly-supervised examples from Wikipedia. On entity probing tasks involving recognizing entity identity, our embeddings used in parameter-free downstream models achieve competitive performance with ELMo- and BERT-based embeddings in trained models. We also show that it is possible to reduce the size of our type set in a learning-based way for particular domains. Finally, we show that these embeddings can be post-hoc modified through a small number of rules to incorporate domain knowledge and improve performance.
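
To make the representation concrete, here is a minimal sketch of the idea described in the abstract. It is not the authors' released code: the tiny type vocabulary and the hand-picked logits are invented for illustration. It shows how a vector of per-type posterior probabilities can serve as a human-readable entity embedding and feed a parameter-free downstream comparison.

```python
import numpy as np

# Hypothetical, tiny type vocabulary; the paper uses a large fine-grained type set.
TYPE_VOCAB = ["person", "athlete", "politician", "organization", "location"]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def entity_embedding(type_logits):
    """Turn typing-model logits into independent per-type posterior probabilities."""
    return sigmoid(np.asarray(type_logits, dtype=float))

# Two mentions that a typing model scored against the type vocabulary
# (mocked logits here; in the paper these come from a trained typing model).
mention_a = entity_embedding([4.1, 3.5, -2.0, -3.2, -4.0])
mention_b = entity_embedding([3.8, 2.9, -1.5, -2.8, -3.6])

# The embedding is human readable: each value is the model's confidence
# that the entity belongs to the corresponding type.
print({t: round(p, 3) for t, p in zip(TYPE_VOCAB, mention_a)})

# A parameter-free downstream model: compare two entities with a plain dot
# product (cosine similarity would also work); no end-task fine-tuning is involved.
print("similarity:", round(float(mention_a @ mention_b), 3))
```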
