Paper title
Monant Medical Misinformation Dataset: Mapping Articles to Fact-Checked Claims
Paper authors
Paper abstract
False information has a significant negative influence on individuals as well as on society as a whole. Especially in the current COVID-19 era, we are witnessing an unprecedented growth of medical misinformation. To help tackle this problem with machine learning approaches, we are publishing a feature-rich dataset of approx. 317k medical news articles/blogs and 3.5k fact-checked claims. It also contains 573 manually labelled and more than 51k automatically labelled mappings between claims and articles. Each mapping consists of claim presence, i.e., whether a claim is contained in a given article, and the article's stance towards the claim. We provide several baselines for these two tasks and evaluate them on the manually labelled part of the dataset. The dataset enables a number of additional tasks related to medical misinformation, such as misinformation characterisation studies or studies of misinformation diffusion between sources.