A multi-layer approach to disinformation detection on Twitter

Paper title

A multi-layer approach to disinformation detection on Twitter

Authors

Francesco Pierri, Carlo Piccardi, Stefano Ceri

Abstract

We tackle the problem of classifying news articles pertaining to disinformation vs. mainstream news by solely inspecting their diffusion mechanisms on Twitter. Our technique is inherently simple compared to existing text-based approaches, as it allows us to bypass the multiple levels of complexity found in news content (e.g. grammar, syntax, style). We employ a multi-layer representation of Twitter diffusion networks, and for each layer we compute a set of global network features that quantify different aspects of the sharing process. Experimental results on two large-scale datasets, corresponding to diffusion cascades of news shared in the United States and Italy respectively, show that a simple Logistic Regression model is able to classify disinformation vs. mainstream networks with high accuracy (AUROC up to 94%), also when considering the political bias of different sources in the classification task. We also highlight differences in the sharing patterns of the two news domains that appear to be country-independent. We believe that our network-based approach provides useful insights that pave the way to the future development of a system to detect misleading and harmful information spreading on social media.
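
The per-layer feature extraction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the layer names (`retweet`, `reply`, `quote`, `mention`) and the five features computed here (node count, edge count, density, number of weakly connected components, size of the largest one) are assumptions chosen for illustration, since the abstract names neither the layers nor the features. The functions `layer_features` and `multilayer_features` are hypothetical helpers.

```python
from collections import defaultdict

# Hypothetical layer names; the abstract does not list the layers used.
LAYERS = ("retweet", "reply", "quote", "mention")


def layer_features(edges):
    """Return illustrative global features of one directed layer.

    `edges` is an iterable of (source, target) pairs. Self-loops and
    duplicate edges are ignored.
    """
    edge_set = {(u, v) for u, v in edges if u != v}
    nodes = {n for edge in edge_set for n in edge}
    n, m = len(nodes), len(edge_set)

    # Undirected adjacency, used to find weakly connected components.
    adj = defaultdict(set)
    for u, v in edge_set:
        adj[u].add(v)
        adj[v].add(u)

    seen, sizes = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        sizes.append(size)

    # Directed density: edges present over edges possible.
    density = m / (n * (n - 1)) if n > 1 else 0.0
    return [n, m, density, len(sizes), max(sizes, default=0)]


def multilayer_features(interactions, layers=LAYERS):
    """Concatenate per-layer features into one vector for a news cascade.

    `interactions` is an iterable of (layer, source, target) triples.
    """
    by_layer = defaultdict(list)
    for layer, u, v in interactions:
        by_layer[layer].append((u, v))
    vector = []
    for layer in layers:
        vector.extend(layer_features(by_layer[layer]))
    return vector
```

Under these assumptions, each news article yields one fixed-length vector (here 4 layers times 5 features), and a set of such vectors with disinformation/mainstream labels could then be passed to any standard logistic regression implementation and scored with AUROC.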
