Paper Title
Large Scale Multi-Actor Generative Dialog Modeling
Paper Authors
Abstract
Non-goal-oriented dialog agents (i.e., chatbots) aim to produce varied and engaging conversations with a user; however, they typically exhibit either an inconsistent personality across conversations or the average personality of all users. This paper addresses these issues by controlling an agent's persona at generation time by conditioning on prior conversations of a target actor. In doing so, we are able to utilize more abstract patterns within a person's speech and better emulate them in generated responses. This work introduces the Generative Conversation Control model, an augmented and fine-tuned GPT-2 language model that conditions on past reference conversations to probabilistically model multi-turn conversations in the actor's persona. We introduce an accompanying data collection procedure to obtain 10.3M conversations from 6 months' worth of Reddit comments. We demonstrate that scaling model size from 117M to 8.3B parameters improves perplexity from 23.14 to 13.14 on 1.7M held-out Reddit conversations. Increasing model scale yielded similar improvements in human evaluations that measure preference for model samples over the held-out target distribution in terms of realism (preference increased from 31% to 37%), style matching (37% to 42%), grammar and content quality (29% to 42%), and conversation coherency (32% to 40%). We find that conditionally modeling past conversations improves perplexity by 0.47 in automatic evaluations. Through human trials we identify positive trends between conditional modeling and style matching, and we outline steps to further improve persona control.
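For readers unfamiliar with the perplexity metric quoted above, the following is a minimal sketch of the standard definition: the exponential of the mean negative log-likelihood per token. The abstract does not describe the authors' evaluation code, so the function name `perplexity` and the sample log-probabilities are hypothetical and purely illustrative. The last line only restates the reported 23.14 and 13.14 figures as a relative reduction.

```python
import math


def perplexity(token_log_probs):
    """Perplexity as exp of the mean negative log-likelihood per token.

    `token_log_probs` holds natural-log probabilities that a model
    assigned to each reference token. Hypothetical helper, not the
    authors' evaluation code.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Illustrative values only: three tokens with made-up probabilities.
example_log_probs = [math.log(0.25), math.log(0.5), math.log(0.125)]
example_ppl = perplexity(example_log_probs)

# Relative perplexity reduction implied by the figures in the abstract.
relative_reduction = (23.14 - 13.14) / 23.14
```

Under this definition, lower perplexity means the model assigns higher probability to the held-out text, and the reported drop from 23.14 to 13.14 corresponds to a relative reduction of roughly 43%.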