Paper Title
OPAL: Ontology-Aware Pretrained Language Model for End-to-End Task-Oriented Dialogue
Paper Authors
Paper Abstract
This paper presents an ontology-aware pretrained language model (OPAL) for end-to-end task-oriented dialogue (TOD). Unlike chit-chat dialogue models, task-oriented dialogue models fulfill at least two task-specific modules: a dialogue state tracker (DST) and a response generator (RG). The dialogue state consists of domain-slot-value triples, which are regarded as the user's constraints for searching the domain-related databases. Large-scale task-oriented dialogue data with annotated structured dialogue states are usually inaccessible, which prevents the development of pretrained language models for task-oriented dialogue. We propose a simple yet effective pretraining method to alleviate this problem, which consists of two pretraining phases. The first phase pretrains on large-scale contextual text data, where the structured information of the text is extracted by an information extraction tool. To bridge the gap between the pretraining method and the downstream tasks, we design two pretraining tasks: ontology-like triple recovery and next-text generation, which simulate the DST and the RG, respectively. The second phase fine-tunes the pretrained model on TOD data. The experimental results show that the proposed method achieves a significant boost and obtains competitive performance on the CamRest676 and MultiWOZ benchmarks, even without any TOD data.
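To make the abstract's notion of a dialogue state concrete, here is a minimal sketch (not code from the paper) of domain-slot-value triples used as constraints to query a domain-related database, in the spirit of the DST output described above. All names, slots, and database entries are illustrative assumptions.

```python
# Hypothetical example: a dialogue state as domain-slot-value triples,
# used as search constraints over a toy restaurant database.

# State after a user turn such as
# "I need a cheap Italian restaurant in the centre."
dialogue_state = [
    ("restaurant", "pricerange", "cheap"),
    ("restaurant", "food", "italian"),
    ("restaurant", "area", "centre"),
]

# Toy domain database (invented entries, MultiWOZ-style slots).
restaurant_db = [
    {"name": "pizza hut city centre", "pricerange": "cheap",
     "food": "italian", "area": "centre"},
    {"name": "caffe uno", "pricerange": "expensive",
     "food": "italian", "area": "centre"},
]

def query_db(db, state, domain):
    """Return entries that satisfy every slot-value constraint for the given domain."""
    constraints = {slot: value for d, slot, value in state if d == domain}
    return [row for row in db
            if all(row.get(slot) == value for slot, value in constraints.items())]

matches = query_db(restaurant_db, dialogue_state, "restaurant")
print(matches)  # -> [{'name': 'pizza hut city centre', ...}]
```

In the paper's pipeline, the DST produces such triples from the dialogue history, and the RG conditions on the dialogue context and the database results to generate the system response; the sketch only illustrates how the triples act as database constraints.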