Claim Check-Worthiness Detection as Positive Unlabelled Learning

Paper Title

Claim Check-Worthiness Detection as Positive Unlabelled Learning

Authors

Wright, Dustin; Augenstein, Isabelle

Abstract

As the first step of automatic fact checking, claim check-worthiness detection is a critical component of fact checking systems. There are multiple lines of research which study this problem: check-worthiness ranking from political speeches and debates, rumour detection on Twitter, and citation needed detection from Wikipedia. To date, there has been no structured comparison of these various tasks to understand their relatedness, and no investigation into whether or not a unified approach to all of them is achievable. In this work, we illuminate a central challenge in claim check-worthiness detection underlying all of these tasks, being that they hinge upon detecting both how factual a sentence is, as well as how likely a sentence is to be believed without verification. As such, annotators only mark those instances they judge to be clear-cut check-worthy. Our best performing method is a unified approach which automatically corrects for this using a variant of positive unlabelled learning that finds instances which were incorrectly labelled as not check-worthy. In applying this, we out-perform the state of the art in two of the three tasks studied for claim check-worthiness detection in English.
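To make the idea of "finding instances incorrectly labelled as not check-worthy" concrete, here is a minimal sketch of a generic positive-unlabelled correction in the style of Elkan and Noto (2008). It is not the specific variant the abstract refers to, whose details are not given here. It assumes a classifier has already produced scores g(x) estimating the probability that a sentence was labelled; the function names, the toy scores, and the 0.5 threshold are all hypothetical choices for illustration.

```python
def estimate_label_frequency(held_out_positive_scores):
    """Estimate c = P(labelled | truly positive) as the mean classifier
    score over held-out examples that are known (labelled) positives."""
    return sum(held_out_positive_scores) / len(held_out_positive_scores)


def positive_weight(g, c):
    """Estimated probability that an UNLABELLED example with score g is
    actually positive: w = ((1 - c) / c) * (g / (1 - g)), clipped to [0, 1]."""
    if g >= 1.0:
        return 1.0
    w = (1.0 - c) / c * g / (1.0 - g)
    return min(1.0, max(0.0, w))


def relabel_unlabelled(unlabelled_scores, c, threshold=0.5):
    """Flag unlabelled examples whose estimated positive probability reaches
    the (assumed) threshold; these are candidates for missed positives."""
    return [positive_weight(g, c) >= threshold for g in unlabelled_scores]


# Toy scores, invented for illustration only.
c = estimate_label_frequency([0.5, 0.7, 0.6])
flags = relabel_unlabelled([0.1, 0.45, 0.55], c)
```

With these toy numbers, c is about 0.6, so an unlabelled sentence scored 0.45 already has an estimated positive probability above one half and gets flagged, even though the raw classifier score is below 0.5. That is the sense in which a PU correction can recover check-worthy sentences that annotators left unmarked.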
