Paper Title
Contrastive Learning Reduces Hallucination in Conversations
Paper Authors
Paper Abstract
Pre-trained language models (LMs) store knowledge in their parameters and can generate informative responses when used in conversational systems. However, LMs suffer from the problem of "hallucination": they may generate plausible-looking statements that are irrelevant or factually incorrect. To address this problem, we propose a contrastive learning scheme, named MixCL. A novel mixed contrastive objective is proposed to explicitly optimize the implicit knowledge elicitation process of LMs, and thus reduce their hallucination in conversations. We also examine negative sampling strategies of retrieved hard negatives and model-generated negatives. We conduct experiments on Wizard-of-Wikipedia, a public, open-domain knowledge-grounded dialogue benchmark, and assess the effectiveness of MixCL. MixCL effectively reduces the hallucination of LMs in conversations and achieves the highest performance among LM-based dialogue agents in terms of relevancy and factuality. We show that MixCL achieves comparable performance to state-of-the-art KB-based approaches while enjoying notable advantages in terms of efficiency and scalability.
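The abstract names a mixed contrastive objective over retrieved hard negatives and model-generated negatives but does not give its exact form. For intuition only, below is a minimal InfoNCE-style sketch in PyTorch that contrasts the LM's likelihood of the grounded (positive) response against sampled negatives. The function names, the sequence-level scoring, and the choice of InfoNCE are illustrative assumptions, not the paper's actual MixCL loss.

```python
# Minimal sketch, assuming the LM assigns sequence log-probabilities to the
# gold response and to K sampled negatives. NOT the paper's MixCL objective;
# an InfoNCE-style stand-in for intuition only.
import torch
import torch.nn.functional as F

def sequence_log_prob(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """Sum of per-token log-probabilities for each sequence in the batch.
    logits: (B, T, V) raw LM outputs; targets: (B, T) token ids."""
    log_probs = F.log_softmax(logits, dim=-1)                # (B, T, V)
    token_lp = log_probs.gather(-1, targets.unsqueeze(-1))   # (B, T, 1)
    return token_lp.squeeze(-1).sum(dim=-1)                  # (B,)

def contrastive_loss(pos_lp: torch.Tensor, neg_lp: torch.Tensor) -> torch.Tensor:
    """Treat the positive's score as the correct class among itself and K negatives.
    pos_lp: (B,)   log-prob of the knowledge-grounded response
    neg_lp: (B, K) log-probs of retrieved / model-generated negatives"""
    scores = torch.cat([pos_lp.unsqueeze(1), neg_lp], dim=1)          # (B, 1+K)
    labels = torch.zeros(scores.size(0), dtype=torch.long,
                         device=scores.device)                        # positive at index 0
    return F.cross_entropy(scores, labels)

# Toy usage with random scores (B=2 responses, K=3 negatives each)
pos = torch.randn(2)
neg = torch.randn(2, 3)
loss = contrastive_loss(pos, neg)
```

The design intent this sketch tries to convey: rather than only maximizing the likelihood of the gold response, the loss also explicitly pushes down negatives that look plausible (hard negatives from retrieval, or the model's own hallucinated generations), which is the mechanism the abstract credits for reduced hallucination.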