Monant Medical Misinformation Dataset: Mapping Articles to Fact-Checked Claims

Paper Title

Monant Medical Misinformation Dataset: Mapping Articles to Fact-Checked Claims

Authors

Srba, Ivan; Pecher, Branislav; Tomlein, Matus; Moro, Robert; Stefancova, Elena; Simko, Jakub; Bielikova, Maria

Abstract

False information has a significant negative influence on individuals as well as on the whole society. Especially in the current COVID-19 era, we witness an unprecedented growth of medical misinformation. To help tackle this problem with machine learning approaches, we are publishing a feature-rich dataset of approx. 317k medical news articles/blogs and 3.5k fact-checked claims. It also contains 573 manually and more than 51k automatically labelled mappings between claims and articles. Mappings consist of claim presence, i.e., whether a claim is contained in a given article, and article stance towards the claim. We provide several baselines for these two tasks and evaluate them on the manually labelled part of the dataset. The dataset enables a number of additional tasks related to medical misinformation, such as misinformation characterisation studies or studies of misinformation diffusion between sources.
