Modeling ASR Ambiguity for Dialogue State Tracking Using Word Confusion Networks

Paper title

Modeling ASR Ambiguity for Dialogue State Tracking Using Word Confusion Networks

Authors

Vaishali Pal, Fabien Guillot, Manish Shrivastava, Jean-Michel Renders, Laurent Besacier

Abstract

Spoken dialogue systems typically use a list of top-N ASR hypotheses to infer semantic meaning and track the state of the dialogue. However, ASR graphs such as confusion networks (confnets) provide a compact representation of a richer hypothesis space than a top-N ASR list. In this paper, we study the benefits of using confusion networks with a state-of-the-art neural dialogue state tracker (DST). We encode the 2-dimensional confnet into a 1-dimensional sequence of embeddings using an attentional confusion network encoder, which can be used with any DST system. Our confnet encoder is plugged into the state-of-the-art 'Global-locally Self-Attentive Dialogue State Tracker' (GLAD) model for DST and obtains significant improvements in both accuracy and inference time compared to using top-N ASR hypotheses.
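The idea of collapsing a 2-dimensional confnet into a 1-dimensional sequence of embeddings can be illustrated with a minimal sketch. This is not the authors' encoder: the abstract does not specify the attention formulation, so the scoring rule (dot product with a query vector plus the log of the ASR posterior), the names `encode_confnet` and `query`, and the toy embeddings are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def encode_confnet(confnet, embeddings, query):
    """Collapse each set of parallel arcs into a single vector.

    confnet:    list of positions; each position is a list of
                (word, posterior) alternatives with posterior > 0.
    embeddings: dict mapping word -> embedding vector (list of floats).
    query:      hypothetical attention query vector (an assumption,
                not taken from the paper).
    Returns one pooled embedding per confnet position.
    """
    encoded = []
    for alternatives in confnet:
        # Assumed scoring rule: embedding-query similarity plus log posterior.
        scores = []
        for word, prob in alternatives:
            vec = embeddings[word]
            dot = sum(q * v for q, v in zip(query, vec))
            scores.append(dot + math.log(prob))
        weights = softmax(scores)

        # Attention-weighted sum of the alternatives' embeddings.
        pooled = [0.0] * len(query)
        for weight, (word, _) in zip(weights, alternatives):
            for i, v in enumerate(embeddings[word]):
                pooled[i] += weight * v
        encoded.append(pooled)
    return encoded


# Toy example: two positions, the first with two competing ASR arcs.
embeddings = {
    "cheap": [1.0, 0.0],
    "chip": [0.0, 1.0],
    "food": [0.5, 0.5],
}
confnet = [
    [("cheap", 0.7), ("chip", 0.3)],
    [("food", 1.0)],
]
query = [0.0, 0.0]  # a zero query makes the weights equal the posteriors
encoded = encode_confnet(confnet, embeddings, query)
```

With a zero query the attention weights reduce to the ASR posteriors, so the first position is simply the posterior-weighted average of its two word embeddings. In a trained model the query (or a learned scoring network) would shift those weights, and the resulting one-vector-per-position sequence could be passed to a downstream tracker in place of a single top-1 transcript.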
