Paper Title

MTOP: A Comprehensive Multilingual Task-Oriented Semantic Parsing Benchmark

Authors

Haoran Li, Abhinav Arora, Shuohui Chen, Anchit Gupta, Sonal Gupta, Yashar Mehdad

Abstract

Scaling semantic parsing models for task-oriented dialog systems to new languages is often expensive and time-consuming due to the lack of available datasets. Available datasets suffer from several shortcomings: a) they contain few languages, b) they contain small amounts of labeled examples per language, and c) they are based on the simple intent and slot detection paradigm for non-compositional queries. In this paper, we present a new multilingual dataset, called MTOP, comprising 100k annotated utterances in 6 languages across 11 domains. We use this dataset and other publicly available datasets to conduct a comprehensive benchmarking study on using various state-of-the-art multilingual pre-trained models for task-oriented semantic parsing. On the two existing multilingual datasets, we achieve an average improvement of +6.3 points in Slot F1 over the best results reported in their experiments. Furthermore, we demonstrate strong zero-shot performance using pre-trained models combined with automatic translation and alignment, and a proposed distant supervision method to reduce the noise in slot label projection.
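The abstract contrasts the flat intent/slot detection paradigm with compositional queries, which datasets in the TOP family represent as nested bracketed logical forms where an intent node can appear inside a slot. As a minimal sketch of what such a representation looks like, the snippet below parses a bracketed form into a tree; the example query and its labels are hypothetical illustrations, not taken from MTOP itself.

```python
import re

def parse_top(form: str) -> dict:
    """Parse a bracketed TOP-style logical form into a nested dict.

    Nodes look like [IN:INTENT ...] or [SL:SLOT ...]; anything else
    between brackets is a plain utterance token (a leaf).
    """
    # Split into "[", "]", and whitespace-separated tokens.
    tokens = re.findall(r"\[|\]|[^\s\[\]]+", form)
    pos = 0

    def parse_node() -> dict:
        nonlocal pos
        assert tokens[pos] == "[", "node must start with '['"
        pos += 1
        label = tokens[pos]  # e.g. IN:CREATE_REMINDER or SL:DATE_TIME
        pos += 1
        children = []
        while tokens[pos] != "]":
            if tokens[pos] == "[":
                children.append(parse_node())  # nested intent/slot
            else:
                children.append(tokens[pos])   # plain token leaf
                pos += 1
        pos += 1  # consume "]"
        return {"label": label, "children": children}

    return parse_node()

# Hypothetical compositional query "remind me to call mom tomorrow":
# the SL:TODO slot contains a nested IN:CREATE_CALL intent, which a
# flat intent/slot scheme cannot express.
form = ("[IN:CREATE_REMINDER [SL:TODO [IN:CREATE_CALL "
        "[SL:CONTACT mom ] ] ] [SL:DATE_TIME tomorrow ] ]")
tree = parse_top(form)
```

The nesting is the point: `tree["children"][0]` is the `SL:TODO` slot, and its own child is the embedded `IN:CREATE_CALL` intent, so one utterance yields a hierarchical parse rather than a single intent with flat spans.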
