Paper Title
KG-BART: Knowledge Graph-Augmented BART for Generative Commonsense Reasoning
Paper Authors

Paper Abstract
Generative commonsense reasoning, which aims to empower machines to generate sentences with the capacity to reason over a set of concepts, is a critical bottleneck for text generation. Even the state-of-the-art pre-trained language generation models struggle with this task and often produce implausible and anomalous sentences. One reason is that they rarely consider incorporating knowledge graphs, which can provide rich relational information among commonsense concepts. To promote the ability of commonsense reasoning for text generation, we propose a novel knowledge graph-augmented pre-trained language generation model, KG-BART, which encompasses the complex relations of concepts through the knowledge graph and produces more logical and natural sentences as output. Moreover, KG-BART can leverage graph attention to aggregate rich concept semantics, which enhances model generalization on unseen concept sets. Experiments on the benchmark CommonGen dataset verify the effectiveness of our proposed approach in comparison with several strong pre-trained language generation models; in particular, KG-BART outperforms BART by 5.80 and 4.60 points in terms of BLEU-3 and BLEU-4, respectively. Furthermore, we show that the context generated by our model can serve as background scenarios to benefit downstream commonsense QA tasks.
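To make the graph-attention idea mentioned in the abstract concrete, the sketch below shows one plausible way to aggregate concept semantics over knowledge-graph edges. This is a minimal illustration only, not the paper's actual architecture: the class name `ConceptGraphAttention`, the single-head formulation, and the toy adjacency matrix are all hypothetical assumptions introduced here for explanation.

```python
# Minimal sketch: single-head graph attention over a small concept graph.
# Hypothetical illustration of aggregating KG-neighbor semantics; KG-BART's
# real encoder/decoder integration is more involved than this.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConceptGraphAttention(nn.Module):
    """Updates each concept embedding as an attention-weighted sum of its KG neighbors."""
    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim, bias=False)
        self.attn = nn.Linear(2 * dim, 1, bias=False)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h:   (num_concepts, dim) concept embeddings
        # adj: (num_concepts, num_concepts) binary adjacency from the knowledge graph
        z = self.proj(h)
        n = z.size(0)
        # Pairwise attention logits e_ij = a([z_i ; z_j])
        zi = z.unsqueeze(1).expand(n, n, -1)
        zj = z.unsqueeze(0).expand(n, n, -1)
        e = F.leaky_relu(self.attn(torch.cat([zi, zj], dim=-1)).squeeze(-1))
        # Mask non-neighbors so attention only flows along KG edges
        e = e.masked_fill(adj == 0, float("-inf"))
        alpha = torch.softmax(e, dim=-1)
        return alpha @ z  # each concept aggregates its neighbors' semantics

if __name__ == "__main__":
    torch.manual_seed(0)
    h = torch.randn(4, 8)        # 4 concepts, e.g. {dog, frisbee, catch, throw}
    adj = torch.eye(4)           # self-loops keep every row well-defined
    adj[0, 1] = adj[1, 0] = 1.0  # hypothetical KG edge: dog -- frisbee
    out = ConceptGraphAttention(8)(h, adj)
    print(out.shape)             # torch.Size([4, 8])
```

Restricting the softmax to KG neighbors is what lets relational structure (rather than surface co-occurrence) shape the aggregated concept representations, which is the intuition behind the generalization claim for unseen concept sets.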