Paper Title
Identifying Disinformation Websites Using Infrastructure Features
Authors
Abstract
Platforms have struggled to keep pace with the spread of disinformation. Current responses like user reports, manual analysis, and third-party fact checking are slow and difficult to scale, and as a result, disinformation can spread unchecked for some time after being created. Automation is essential for enabling platforms to respond rapidly to disinformation. In this work, we explore a new direction for automated detection of disinformation websites: infrastructure features. Our hypothesis is that while disinformation websites may be perceptually similar to authentic news websites, there may also be significant non-perceptual differences in the domain registrations, TLS/SSL certificates, and web hosting configurations. Infrastructure features are particularly valuable for detecting disinformation websites because they are available before content goes live and reaches readers, enabling early detection. We demonstrate the feasibility of our approach on a large corpus of labeled website snapshots. We also present results from a preliminary real-time deployment, successfully discovering disinformation websites while highlighting unexplored challenges for automated disinformation detection.
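To make the idea of infrastructure features concrete, the following is a minimal sketch of how pre-collected domain registration, TLS certificate, and hosting data might be turned into numeric inputs for a classifier. It is not the authors' implementation: the `InfraRecord` type, the `extract_features` function, and the specific features chosen are all hypothetical and used here only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class InfraRecord:
    """Hypothetical, pre-collected infrastructure data for one website."""
    domain: str
    registered_at: datetime    # domain registration date (e.g. from WHOIS)
    cert_not_before: datetime  # start of TLS certificate validity
    cert_not_after: datetime   # end of TLS certificate validity
    cert_san_count: int        # number of names listed on the certificate
    hosting_asn: int           # autonomous system number of the web host


def extract_features(rec: InfraRecord, observed_at: datetime) -> dict:
    """Derive illustrative (assumed) features; none require page content."""
    tld = rec.domain.rsplit(".", 1)[-1].lower()
    return {
        "domain_age_days": (observed_at - rec.registered_at).days,
        "cert_lifetime_days": (rec.cert_not_after - rec.cert_not_before).days,
        "cert_san_count": rec.cert_san_count,
        "domain_length": len(rec.domain),
        "tld_is_com": int(tld == "com"),
        "hosting_asn": rec.hosting_asn,
    }


# Usage with made-up values for a placeholder domain.
record = InfraRecord(
    domain="example-news.com",
    registered_at=datetime(2020, 1, 1),
    cert_not_before=datetime(2020, 1, 5),
    cert_not_after=datetime(2020, 4, 4),
    cert_san_count=2,
    hosting_asn=64500,  # ASN from the range reserved for documentation
)
features = extract_features(record, observed_at=datetime(2020, 1, 31))
```

Because every input in this sketch comes from registration, certificate, or hosting data, such features could in principle be computed before a site publishes any content, which is the early-detection property the abstract emphasizes.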