Detecting COVID-19 Conspiracy Theories with Transformers and TF-IDF

Paper Title

Detecting COVID-19 Conspiracy Theories with Transformers and TF-IDF

Authors

Haoming Guo, Tianyi Huang, Huixuan Huang, Mingyue Fan, Gerald Friedland

Abstract

The sharing of fake news and conspiracy theories on social media has widespread negative effects. By designing and applying different machine learning models, researchers have made progress in detecting fake news from text. However, existing research places a heavy emphasis on general, common-sense fake news, while in reality fake news often involves rapidly changing topics and domain-specific vocabulary. In this paper, we present our methods and results for three fake news detection tasks at the MediaEval 2021 benchmark that specifically involve COVID-19-related topics. We experiment with a group of text-based models including Support Vector Machines, Random Forest, BERT, and RoBERTa. We find that a pre-trained transformer yields the best validation results, but a randomly initialized transformer with smart design can also be trained to reach accuracy close to that of the pre-trained transformer.

Paper Title