Paper Title
CoCoLM: COmplex COmmonsense Enhanced Language Model with Discourse Relations
Paper Authors
Paper Abstract
Large-scale pre-trained language models have demonstrated strong knowledge representation ability. However, recent studies suggest that even though these giant models contain rich simple commonsense knowledge (e.g., birds can fly and fish can swim), they often struggle with complex commonsense knowledge that involves multiple eventualities (verb-centric phrases, e.g., identifying the relationship between ``Jim yells at Bob'' and ``Bob is upset''). To address this problem, in this paper, we propose to help pre-trained language models better incorporate complex commonsense knowledge. Different from existing fine-tuning approaches, we do not focus on a specific task; instead, we propose a general language model named CoCoLM. Through careful training over ASER, a large-scale eventuality knowledge graph, we successfully teach pre-trained language models (i.e., BERT and RoBERTa) rich complex commonsense knowledge among eventualities. Experiments on multiple downstream commonsense tasks that require a correct understanding of eventualities demonstrate the effectiveness of CoCoLM.
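The abstract does not spell out CoCoLM's exact training objective, but the overall idea of teaching a pre-trained model relations between eventualities can be illustrated with a minimal sketch. The snippet below assumes a sentence-pair classification setup over eventuality pairs with discourse-style relation labels; the relation inventory, example pairs, and objective are illustrative assumptions, not the paper's method.

```python
# Minimal sketch (assumptions, not CoCoLM's actual procedure): fine-tune BERT
# to classify the relation between two eventualities drawn from a knowledge
# graph such as ASER.
import torch
from torch.utils.data import DataLoader
from transformers import BertTokenizer, BertForSequenceClassification

# Hypothetical discourse relations between eventualities.
RELATIONS = ["Result", "Reason", "Condition", "Contrast"]

# Toy eventuality pairs with assumed relation labels.
pairs = [
    ("Jim yells at Bob", "Bob is upset", "Result"),
    ("Bob is hungry", "Bob eats a sandwich", "Reason"),
]

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(RELATIONS)
)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

def collate(batch):
    # Encode each (head, tail) eventuality pair as a BERT sentence pair.
    heads, tails, labels = zip(*batch)
    enc = tokenizer(list(heads), list(tails), padding=True, return_tensors="pt")
    enc["labels"] = torch.tensor([RELATIONS.index(r) for r in labels])
    return enc

loader = DataLoader(pairs, batch_size=2, collate_fn=collate)

model.train()
for batch in loader:
    out = model(**batch)  # cross-entropy loss over relation labels
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

In a setup like this, the relation-classification signal between eventuality pairs is what pushes complex commonsense into the encoder's weights, which can then be transferred to downstream commonsense tasks.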