A Review on Fact Extraction and Verification

Paper Title

A Review on Fact Extraction and Verification

Authors

Giannis Bekoulis, Christina Papagiannopoulou, Nikos Deligiannis

Abstract

We study the fact-checking problem, which aims to identify the veracity of a given claim. Specifically, we focus on the task of Fact Extraction and VERification (FEVER) and its accompanying dataset. The task consists of the subtasks of retrieving the relevant documents (and sentences) from Wikipedia and validating whether the information in the documents supports or refutes a given claim. This task is essential and can be the building block of applications such as fake news detection and medical claim verification. In this paper, we aim at a better understanding of the challenges of the task by presenting the literature in a structured and comprehensive way. We describe the proposed methods by analyzing the technical perspectives of the different approaches and discussing the performance results on the FEVER dataset, which is the most well-studied and formally structured dataset on the fact extraction and verification task. We also conduct the largest experimental study to date on identifying beneficial loss functions for the sentence retrieval component. Our analysis indicates that sampling negative sentences is important for improving performance and decreasing computational complexity. Finally, we describe open issues and future challenges, and we motivate future research on the task.
