Paper Title
NELA-GT-2019: A Large Multi-Labelled News Dataset for The Study of Misinformation in News Articles
Paper Authors
Paper Abstract
In this paper, we present an updated version of the NELA-GT-2018 dataset (Nørregaard, Horne, and Adalı 2019), entitled NELA-GT-2019. NELA-GT-2019 contains 1.12M news articles from 260 sources collected between January 1st 2019 and December 31st 2019. Just as with NELA-GT-2018, these sources come from a wide range of mainstream news sources and alternative news sources. Included with the dataset are source-level ground truth labels from 7 different assessment sites covering multiple dimensions of veracity. The NELA-GT-2019 dataset can be found at: https://doi.org/10.7910/DVN/O7FWPO