Paper title
MultiWOZ 2.3: A multi-domain task-oriented dialogue dataset enhanced with annotation corrections and co-reference annotation
Paper authors
Paper abstract
Task-oriented dialogue systems have made unprecedented progress, with multiple state-of-the-art (SOTA) models underpinned by a number of publicly available MultiWOZ datasets. Dialogue state annotations are error-prone, leading to sub-optimal performance. Various efforts have been made to rectify the annotation errors present in the original MultiWOZ dataset. In this paper, we introduce MultiWOZ 2.3, in which we differentiate incorrect annotations in dialogue acts from dialogue states, identifying a lack of co-reference when publishing the updated dataset. To ensure consistency between dialogue acts and dialogue states, we implement co-reference features and unify annotations of dialogue acts and dialogue states. We update the state-of-the-art performance of natural language understanding and dialogue state tracking on MultiWOZ 2.3, where the results show significant improvements over previous versions of the MultiWOZ dataset (2.0-2.2).