Paper Title

Generating Fact Checking Briefs

Paper Authors

Angela Fan, Aleksandra Piktus, Fabio Petroni, Guillaume Wenzek, Marzieh Saeidi, Andreas Vlachos, Antoine Bordes, Sebastian Riedel

Paper Abstract

Fact checking at scale is difficult -- while the number of active fact checking websites is growing, it remains too small for the needs of the contemporary media ecosystem. However, despite good intentions, contributions from volunteers are often error-prone, and thus in practice restricted to claim detection. We investigate how to increase the accuracy and efficiency of fact checking by providing information about the claim before performing the check, in the form of natural language briefs. We investigate passage-based briefs, containing a relevant passage from Wikipedia, entity-centric ones consisting of Wikipedia pages of mentioned entities, and Question-Answering Briefs, with questions decomposing the claim, and their answers. To produce QABriefs, we develop QABriefer, a model that generates a set of questions conditioned on the claim, searches the web for evidence, and generates answers. To train its components, we introduce QABriefDataset which we collected via crowdsourcing. We show that fact checking with briefs -- in particular QABriefs -- increases the accuracy of crowdworkers by 10% while slightly decreasing the time taken. For volunteer (unpaid) fact checkers, QABriefs slightly increase accuracy and reduce the time required by around 20%.
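
To make the pipeline concrete, below is a minimal Python sketch of the three-stage structure the abstract describes (generate questions conditioned on the claim, search the web for evidence, generate answers). All function bodies, names, and the example claim are hypothetical stand-ins: the paper trains learned components on QABriefDataset and retrieves real web evidence, none of which is reproduced here.

```python
# Minimal sketch of the QABriefer pipeline structure, assuming
# placeholder implementations for each stage. The actual system uses
# trained models; these stubs only illustrate the data flow.

from dataclasses import dataclass


@dataclass
class QAPair:
    """One question/answer pair in a QABrief, with its evidence source."""
    question: str
    answer: str
    evidence_url: str


def generate_questions(claim: str) -> list[str]:
    # Placeholder for the learned question generator, which produces a
    # set of questions decomposing the claim. This heuristic is NOT the
    # paper's model.
    return [f"Is it true that {claim.rstrip('.')}?"]


def search_web(question: str) -> list[str]:
    # Placeholder for web search over evidence documents; the URL below
    # is a hypothetical result, not a real retrieval call.
    return ["https://example.org/evidence"]


def generate_answer(question: str, evidence: list[str]) -> str:
    # Placeholder for the learned answer generator, which answers the
    # question from the retrieved evidence.
    return "(answer generated from retrieved evidence)"


def qabrief(claim: str) -> list[QAPair]:
    """Compose the three stages into a QABrief for a single claim."""
    brief = []
    for question in generate_questions(claim):
        evidence = search_web(question)
        answer = generate_answer(question, evidence)
        brief.append(QAPair(question, answer, evidence[0]))
    return brief


if __name__ == "__main__":
    # Hypothetical example claim, for illustration only.
    for pair in qabrief("The Eiffel Tower was completed in 1889."):
        print(pair.question, "->", pair.answer)
```

The key design point the abstract implies is that the brief is produced before the human check begins, so each stage only conditions on the claim and retrieved evidence, never on a fact checker's verdict.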
