Paper Title

DialGuide: Aligning Dialogue Model Behavior with Developer Guidelines

Paper Authors

Prakhar Gupta, Yang Liu, Di Jin, Behnam Hedayatnia, Spandana Gella, Sijia Liu, Patrick Lange, Julia Hirschberg, Dilek Hakkani-Tur

Paper Abstract

Dialogue models are able to generate coherent and fluent responses, but they can still be challenging to control and may produce non-engaging, unsafe results. This unpredictability diminishes user trust and can hinder the use of the models in the real world. To address this, we introduce DialGuide, a novel framework for controlling dialogue model behavior using natural language rules, or guidelines. These guidelines provide information about the context they are applicable to and what should be included in the response, allowing the models to generate responses that are more closely aligned with the developer's expectations and intent. We evaluate DialGuide on three tasks in open-domain dialogue response generation: guideline selection, response generation, and response entailment verification. Our dataset contains 10,737 positive and 15,467 negative dialogue context-response-guideline triplets across two domains - chit-chat and safety. We provide baseline models for the tasks and benchmark their performance. We also demonstrate that DialGuide is effective in the dialogue safety domain, producing safe and engaging responses that follow developer guidelines.