Paper Title

Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge

Authors

Alon Talmor, Oyvind Tafjord, Peter Clark, Yoav Goldberg, Jonathan Berant

Abstract

To what extent can a neural network systematically reason over symbolic facts? Evidence suggests that large pre-trained language models (LMs) acquire some reasoning capacity, but this ability is difficult to control. Recently, it has been shown that Transformer-based models succeed in consistent reasoning over explicit symbolic facts, under a "closed-world" assumption. However, in an open-domain setup, it is desirable to tap into the vast reservoir of implicit knowledge already encoded in the parameters of pre-trained LMs. In this work, we provide a first demonstration that LMs can be trained to reliably perform systematic reasoning combining both implicit, pre-trained knowledge and explicit natural language statements. To do this, we describe a procedure for automatically generating datasets that teach a model new reasoning skills, and demonstrate that models learn to effectively perform inference which involves implicit taxonomic and world knowledge, chaining and counting. Finally, we show that "teaching" models to reason generalizes beyond the training distribution: they successfully compose the usage of multiple reasoning skills in single examples. Our work paves a path towards open-domain systems that constantly improve by interacting with users who can instantly correct a model by adding simple natural language statements.
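The abstract describes a procedure for automatically generating training examples whose answers require combining an explicit natural-language rule with implicit taxonomic knowledge. The following is a minimal, hypothetical Python sketch of that idea under toy assumptions; the TAXONOMY and PROPERTIES tables and the make_example helper are illustrative stand-ins, not the paper's actual data-generation pipeline.

```python
import random

# Hypothetical toy knowledge tables; the real generator draws on large
# taxonomic resources, which are not reproduced here.
TAXONOMY = {            # implicit knowledge the model is expected to already hold
    "whale": "mammal",
    "sparrow": "bird",
    "salmon": "fish",
}
PROPERTIES = {          # explicit rules supplied in the input context
    "mammal": "has a belly button",
    "bird": "has feathers",
    "fish": "has gills",
}

def make_example(rng: random.Random) -> dict:
    """Create one (context, hypothesis, label) triple.

    The hypernym statement (e.g. "A whale is a mammal") is deliberately
    left OUT of the context, so answering correctly requires combining the
    explicit rule with implicit, pre-trained taxonomic knowledge.
    """
    entity, category = rng.choice(list(TAXONOMY.items()))
    if rng.random() < 0.5:
        # Positive example: the hypothesis follows from the rule plus the
        # implicit fact that the entity belongs to the category.
        prop, label = PROPERTIES[category], True
    else:
        # Negative example: attribute a property of a different category.
        other = rng.choice([c for c in PROPERTIES if c != category])
        prop, label = PROPERTIES[other], False
    return {
        "context": f"A {category} {PROPERTIES[category]}.",   # explicit statement
        "hypothesis": f"A {entity} {prop}.",
        "label": label,
    }

if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(make_example(rng))
```

Because each generated context omits the hypernym link, a model can only label the hypothesis correctly by supplying that link from its pre-trained knowledge, which mirrors the "leap of thought" setup the abstract describes.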
