Paper Title
DSTEA: Improving Dialogue State Tracking via Entity Adaptive Pre-training
Paper Authors
Paper Abstract
Dialogue State Tracking (DST) is critical for comprehensively interpreting user and system utterances, thereby forming the cornerstone of efficient dialogue systems. Although past research has focused on enhancing DST performance through alterations to the model structure or the integration of additional features such as graph relations, these approaches often require additional pre-training with external dialogue corpora. In this study, we propose DSTEA, which improves Dialogue State Tracking via Entity Adaptive pre-training and enhances the encoder by intensively training key entities in dialogue utterances. DSTEA identifies these pivotal entities from input dialogues using four different methods: ontology information, named-entity recognition, the spaCy library, and the flair library. Subsequently, it employs selective knowledge masking to train the model effectively. Remarkably, DSTEA requires only pre-training, without the direct infusion of extra knowledge into the DST model. This approach yields substantial performance improvements for four robust DST models on MultiWOZ 2.0, 2.1, and 2.2, with joint goal accuracy increasing by up to 2.69% (from 52.41% to 55.10%). DSTEA's efficacy is further validated through comparative experiments considering various entity types and different entity-adaptive pre-training configurations, such as masking strategy and masking rate.
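To make the selective knowledge masking idea concrete, the following is a minimal sketch of entity-adaptive masking. It assumes spaCy for entity detection (the abstract also lists ontology information, named-entity recognition, and flair as alternatives) and a BERT-style fast tokenizer; the function name entity_adaptive_mask and the 0.3 masking rate are illustrative assumptions, not the authors' exact implementation.

# A minimal sketch, assuming spaCy NER and a BERT-style tokenizer.
# entity_adaptive_mask and mask_rate=0.3 are illustrative, not the
# paper's exact configuration (the paper compares several rates).
import random
import spacy
from transformers import BertTokenizerFast

nlp = spacy.load("en_core_web_sm")                        # entity detector
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

def entity_adaptive_mask(utterance: str, mask_rate: float = 0.3):
    """Mask sub-word tokens that fall inside detected entity spans."""
    # Character spans of entities found in the utterance.
    ents = [(e.start_char, e.end_char) for e in nlp(utterance).ents]
    enc = tokenizer(utterance, return_offsets_mapping=True)
    input_ids = enc["input_ids"]
    labels = [-100] * len(input_ids)                      # -100 = ignored by MLM loss
    for i, (start, end) in enumerate(enc["offset_mapping"]):
        # Special tokens have (0, 0) offsets and are skipped by start < end.
        in_entity = any(s <= start and end <= t for s, t in ents)
        if in_entity and start < end and random.random() < mask_rate:
            labels[i] = input_ids[i]                      # predict the original id
            input_ids[i] = tokenizer.mask_token_id        # replace with [MASK]
    return input_ids, labels

ids, labels = entity_adaptive_mask("I need a taxi to the Cambridge museum at 5pm.")
print(tokenizer.decode(ids))

Restricting masking to entity tokens, rather than sampling positions uniformly as in standard masked language modeling, is what focuses the pre-training signal on the slot-value-bearing words that DST must track.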