Paper title
Common-Knowledge Concept Recognition for SEVA
Paper authors
Paper abstract
We build a common-knowledge concept recognition system for a Systems Engineer's Virtual Assistant (SEVA) that can be used for downstream tasks such as relation extraction, knowledge graph construction, and question answering. The problem is formulated as a token classification task similar to named entity extraction. With the help of a domain expert and text processing methods, we construct a dataset annotated at the word level, carefully defining a labeling scheme so that a sequence model can be trained to recognize systems engineering concepts. We fine-tune a pre-trained language model on this labeled concept dataset. In addition, we create essential datasets for information such as abbreviations and definitions from the systems engineering domain. Finally, we construct a simple knowledge graph from the extracted concepts along with some hyponym relations.
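To make the token classification framing concrete, the following is a minimal sketch, not the authors' implementation. It assumes a BIO-style labeling scheme (the abstract does not specify the actual scheme), and the tag name `EVENT`, the function names `decode_bio` and `build_graph`, the `is_a` edge label, and the example sentence are all hypothetical. It shows how word-level tags predicted by a sequence model could be grouped into concept spans, and how hyponym pairs could be stored as a simple knowledge graph.

```python
from collections import defaultdict


def decode_bio(tokens, tags):
    """Group word-level BIO tags into (concept_text, label) spans.

    Assumption: tags look like "B-LABEL", "I-LABEL", or "O".
    An "I-" tag that does not continue an open span of the same
    label is treated as outside any concept.
    """
    spans, current, label = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new concept starts; close any span that is still open.
            if current:
                spans.append((" ".join(current), label))
            current, label = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == label:
            # Continue the open concept span.
            current.append(token)
        else:
            # "O" or an inconsistent "I-" tag ends the open span.
            if current:
                spans.append((" ".join(current), label))
            current, label = [], None
    if current:
        spans.append((" ".join(current), label))
    return spans


def build_graph(hyponym_pairs):
    """Store (hyponym, hypernym) pairs as a directed adjacency map."""
    graph = defaultdict(set)
    for hyponym, hypernym in hyponym_pairs:
        graph[hyponym].add(("is_a", hypernym))
    return graph


# Hypothetical model output for one sentence.
tokens = ["The", "preliminary", "design", "review", "follows", "planning"]
tags = ["O", "B-EVENT", "I-EVENT", "I-EVENT", "O", "O"]
concepts = decode_bio(tokens, tags)

# Hypothetical hyponym relation linking an extracted concept to a broader one.
graph = build_graph([(concepts[0][0], "technical review")])
```

In a setup like the one the abstract describes, the `tags` list would come from the fine-tuned language model's per-token predictions rather than being written by hand.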