Paper Title
Open-Domain Dialog Evaluation using Follow-Ups Likelihood
Paper Authors
Paper Abstract
Automatic evaluation of open-domain dialogs remains an unsolved problem. Moreover, existing methods do not correlate strongly with human annotations. This paper presents a new automated evaluation method using follow-ups: we measure the probability that a language model will continue the conversation with a fixed set of follow-ups (e.g., "Not really relevant here.", "What are you trying to say?"). Compared against twelve existing methods, our new evaluation achieves the highest correlation with human evaluations.
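To make the scoring idea concrete, here is a minimal sketch of follow-up likelihood evaluation, assuming a DialoGPT-style causal language model scored via Hugging Face transformers. The model name (microsoft/DialoGPT-medium), the wording of the negative follow-ups, and the averaging scheme are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of follow-up likelihood scoring for a dialog response.
# Assumptions (not necessarily the paper's setup): DialoGPT as the language
# model, a two-item negative follow-up list, and mean log-likelihood scoring.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "microsoft/DialoGPT-medium"  # assumed model choice
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

# Fixed follow-ups that signal a poor response (illustrative wording).
NEGATIVE_FOLLOWUPS = [
    "That's not really relevant here.",
    "What are you trying to say?",
]

def followup_log_likelihood(context: str, followup: str) -> float:
    """Mean log-probability of the follow-up tokens given the dialog context."""
    ctx_ids = tokenizer.encode(context + tokenizer.eos_token, return_tensors="pt")
    fu_ids = tokenizer.encode(followup, return_tensors="pt")
    input_ids = torch.cat([ctx_ids, fu_ids], dim=-1)
    labels = input_ids.clone()
    labels[:, : ctx_ids.size(-1)] = -100  # ignore context positions in the loss
    with torch.no_grad():
        out = model(input_ids, labels=labels)
    return -out.loss.item()  # loss is the mean negative log-likelihood

def dialog_quality_score(context: str) -> float:
    """Higher likelihood of negative follow-ups implies a worse response,
    so the mean follow-up likelihood is negated to obtain a quality score."""
    lls = [followup_log_likelihood(context, fu) for fu in NEGATIVE_FOLLOWUPS]
    return -sum(lls) / len(lls)

# Example: a dialog whose last response is incoherent should score low,
# because the negative follow-ups become likely continuations.
context = "How was your weekend?" + tokenizer.eos_token + "Banana seven airplane."
print(dialog_quality_score(context))
```

Under this sketch, turns are separated by the tokenizer's end-of-sequence token, as is conventional for DialoGPT; the paper's actual follow-up set and aggregation may differ.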