Paper Title

JamPatoisNLI: A Jamaican Patois Natural Language Inference Dataset

Authors

Ruth-Ann Armstrong, John Hewitt, Christopher Manning

Abstract

JamPatoisNLI provides the first dataset for natural language inference in a creole language, Jamaican Patois. Many of the most-spoken low-resource languages are creoles. These languages commonly have a lexicon derived from a major world language and a distinctive grammar reflecting the languages of the original speakers and the process of language birth by creolization. This gives them a distinctive place in exploring the effectiveness of transfer from large monolingual or multilingual pretrained models. While our work, along with previous work, shows that transfer from these models to low-resource languages that are unrelated to languages in their training set is not very effective, we would expect stronger results from transfer to creoles. Indeed, our experiments show considerably better results from few-shot learning of JamPatoisNLI than for such unrelated languages, and help us begin to understand how the unique relationship between creoles and their high-resource base languages affects cross-lingual transfer. JamPatoisNLI, which consists of naturally occurring premises and expert-written hypotheses, is a step towards steering research into a traditionally underserved language and a useful benchmark for understanding cross-lingual NLP.