Paper Title
对话状态跟踪的双重学习
Dual Learning for Dialogue State Tracking
Paper Authors
Paper Abstract
In task-oriented multi-turn dialogue systems, the dialogue state is a compact representation of the user goal given the dialogue history. Dialogue state tracking (DST) aims to estimate the dialogue state at each turn. Because of its dependency on complicated dialogue history contexts, DST data annotation is more expensive than that for single-sentence language understanding, which makes the task more challenging. In this work, we formulate DST as a sequence generation problem and propose a novel dual-learning framework to make full use of unlabeled data. In the dual-learning framework, there are two agents: the primal tracker agent (an utterance-to-state generator) and the dual utterance generator agent (a state-to-utterance generator). Compared with the traditional supervised learning framework, dual learning can iteratively update both agents through the reconstruction error and the reward signal, respectively, without labeled data. The reward sparsity problem is hard to solve in previous DST methods; in this work, the reformulation of DST as a sequence generation model effectively alleviates it. We call this primal tracker agent dual-DST. Experimental results on the MultiWOZ 2.1 dataset show that the proposed dual-DST works very well, especially when labeled data is limited. It achieves performance comparable to a system where labeled data is fully used.
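To make the dual-learning loop in the abstract concrete, the following is a minimal, purely illustrative Python sketch. The two agents are stood in for by toy lookup tables (a real system would use seq2seq models), and the class and function names are hypothetical, not from the paper; the sketch only shows the round-trip structure: an unlabeled utterance is mapped to a state by the primal tracker, mapped back by the dual generator, and the reconstruction quality yields a reward signal with no gold label required.

```python
class PrimalTracker:
    """Primal agent: utterance-to-state generator (toy lookup stand-in)."""
    def __init__(self):
        self.table = {"i want a cheap hotel": "hotel-pricerange=cheap"}

    def predict(self, utterance):
        return self.table.get(utterance, "<unk>")


class DualGenerator:
    """Dual agent: state-to-utterance generator (toy lookup stand-in)."""
    def __init__(self):
        self.table = {"hotel-pricerange=cheap": "i want a cheap hotel"}

    def predict(self, state):
        return self.table.get(state, "<unk>")


def reconstruction_reward(original, reconstructed):
    # Reward from the closed loop: 1.0 if the round trip reproduces the
    # input utterance, 0.0 otherwise. A real system would use a softer
    # score (e.g. reconstruction log-likelihood).
    return 1.0 if original == reconstructed else 0.0


def dual_learning_step(tracker, generator, unlabeled_utterance):
    """One primal-direction step on an unlabeled utterance."""
    state = tracker.predict(unlabeled_utterance)          # primal pass
    reconstruction = generator.predict(state)             # dual pass
    reward = reconstruction_reward(unlabeled_utterance, reconstruction)
    # In the actual framework, both agents would be updated from this
    # reward (e.g. via policy gradient); here we only return the signal.
    return state, reward
```

A symmetric dual-direction step (state to utterance and back) would provide the training signal for the utterance generator in the same way.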