LogiQA: A Challenge Dataset for Machine Reading Comprehension with Logical Reasoning

Paper Title

LogiQA: A Challenge Dataset for Machine Reading Comprehension with Logical Reasoning

Authors

Jian Liu, Leyang Cui, Hanmeng Liu, Dandan Huang, Yile Wang, Yue Zhang

Abstract

Machine reading is a fundamental task for testing natural language understanding, and it is closely related to human cognition in many aspects. With the rise of deep learning techniques, algorithmic models rival human performance on simple QA, and thus increasingly challenging machine reading datasets have been proposed. Though various challenges such as evidence integration and commonsense knowledge have been integrated, one of the fundamental capabilities in human reading, namely logical reasoning, has not been fully investigated. We build a comprehensive dataset, named LogiQA, which is sourced from expert-written questions for testing human logical reasoning. It consists of 8,678 QA instances, covering multiple types of deductive reasoning. Results show that state-of-the-art neural models perform far worse than the human ceiling. Our dataset can also serve as a benchmark for reinvestigating logical AI under the deep learning NLP setting. The dataset is freely available at https://github.com/lgw863/LogiQA-dataset
