Paper title
Task2Dial: A Novel Task and Dataset for Commonsense enhanced Task-based Dialogue Grounded in Documents
Paper authors
Paper abstract
This paper proposes a novel task on commonsense-enhanced task-based dialogue grounded in documents and describes the Task2Dial dataset, a novel dataset of document-grounded task-based dialogues, where an Information Giver (IG) provides instructions (by consulting a document) to an Information Follower (IF), so that the latter can successfully complete the task. In this unique setting, the IF can ask clarification questions which may not be grounded in the underlying document and require commonsense knowledge to be answered. The Task2Dial dataset poses new challenges: (1) its human reference texts show more lexical richness and variation than other document-grounded dialogue datasets; (2) generating from this set requires paraphrasing, as instructional responses might have been modified from the underlying document; (3) answering requires commonsense knowledge, since questions might not necessarily be grounded in the document; (4) generating requires planning based on context, as task steps need to be provided in order. The Task2Dial dataset contains dialogues with an average of 18.15 turns and 19.79 tokens per turn, as compared to 12.94 and 12 respectively in existing datasets. As such, learning from this dataset promises more natural, varied and less template-like system utterances.