Paper Title
Can You Put it All Together: Evaluating Conversational Agents' Ability to Blend Skills
Paper Authors
Paper Abstract
Being engaging, knowledgeable, and empathetic are all desirable general qualities in a conversational agent. Previous work has introduced tasks and datasets that aim to help agents learn those qualities in isolation and gauge how well they can express them. But rather than being specialized in one single quality, a good open-domain conversational agent should be able to seamlessly blend them all into one cohesive conversational flow. In this work, we investigate several ways to combine models trained towards isolated capabilities, ranging from simple model aggregation schemes that require minimal additional training, to various forms of multi-task training that encompass several skills at all training stages. We further propose a new dataset, BlendedSkillTalk, to analyze how these capabilities would mesh together in a natural conversation, and compare the performance of different architectures and training schemes. Our experiments show that multi-tasking over several tasks that focus on particular capabilities results in better blended conversation performance compared to models trained on a single skill, and that both unified and two-stage approaches perform well if they are constructed to avoid unwanted bias in skill selection or are fine-tuned on our new task.
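To make the "two-stage" idea mentioned in the abstract concrete, below is a minimal, hypothetical Python sketch (not the paper's code): a first stage selects which skill an incoming utterance calls for, and a second stage hands the utterance to a responder specialized in that skill. The classifier and responders here are keyword-based stubs purely to illustrate the control flow; in the paper these would be trained neural models, and the skill names and function names are illustrative assumptions.

```python
# Hypothetical two-stage pipeline sketch: skill selection, then skill-specific response.
from typing import Callable, Dict


def empathetic_responder(utterance: str) -> str:
    # Stand-in for a model trained for empathy (e.g., on an empathetic-dialogue dataset).
    return "That sounds tough. How are you feeling about it?"


def knowledge_responder(utterance: str) -> str:
    # Stand-in for a knowledge-grounded dialogue model.
    return "Here is something relevant I know about that topic."


def persona_responder(utterance: str) -> str:
    # Stand-in for a persona-based (engaging) chat model.
    return "Interesting! Personally, I love talking about my hobbies."


RESPONDERS: Dict[str, Callable[[str], str]] = {
    "empathy": empathetic_responder,
    "knowledge": knowledge_responder,
    "persona": persona_responder,
}


def select_skill(utterance: str) -> str:
    # Toy stage-1 classifier; a real system would use a trained utterance classifier
    # constructed to avoid unwanted bias toward any one skill.
    lowered = utterance.lower()
    if any(w in lowered for w in ("sad", "worried", "upset", "excited")):
        return "empathy"
    if any(w in lowered for w in ("what is", "tell me about", "history of")):
        return "knowledge"
    return "persona"


def respond(utterance: str) -> str:
    skill = select_skill(utterance)      # stage 1: pick a skill
    return RESPONDERS[skill](utterance)  # stage 2: respond with that skill's model


if __name__ == "__main__":
    print(respond("I'm feeling really worried about my exam tomorrow."))
```

By contrast, the unified (multi-task) approaches described in the abstract would train a single model on all of the skill datasets at once, removing the explicit selection step.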