A Controllable Model of Grounded Response Generation

Paper Title

A Controllable Model of Grounded Response Generation

Authors

Zeqiu Wu, Michel Galley, Chris Brockett, Yizhe Zhang, Xiang Gao, Chris Quirk, Rik Koncel-Kedziorski, Jianfeng Gao, Hannaneh Hajishirzi, Mari Ostendorf, Bill Dolan

Abstract

Current end-to-end neural conversation models inherently lack the flexibility to impose semantic control in the response generation process, often resulting in uninteresting responses. Attempts to boost informativeness alone come at the expense of factual accuracy, as attested by pretrained language models' propensity to "hallucinate" facts. While this may be mitigated by access to background knowledge, there is scant guarantee of relevance and informativeness in generated responses. We propose a framework that we call controllable grounded response generation (CGRG), in which lexical control phrases are either provided by a user or automatically extracted by a control phrase predictor from dialogue context and grounding knowledge. Quantitative and qualitative results show that, using this framework, a transformer-based model with a novel inductive attention mechanism, trained on a conversation-like Reddit dataset, outperforms strong generation baselines.
