Paper title
Detection of Novel Social Bots by Ensembles of Specialized Classifiers
Paper authors
Abstract
Malicious actors create inauthentic social media accounts controlled in part by algorithms, known as social bots, to disseminate misinformation and agitate online discussion. While researchers have developed sophisticated methods to detect abuse, novel bots with diverse behaviors evade detection. We show that different types of bots are characterized by different behavioral features. As a result, supervised learning techniques suffer severe performance deterioration when attempting to detect behaviors not observed in the training data. Moreover, tuning these models to recognize novel bots requires retraining with a significant amount of new annotations, which are expensive to obtain. To address these issues, we propose a new supervised learning method that trains classifiers specialized for each class of bots and combines their decisions through the maximum rule. The ensemble of specialized classifiers (ESC) can better generalize, leading to an average improvement of 56% in F1 score for unseen accounts across datasets. Furthermore, novel bot behaviors are learned with fewer labeled examples during retraining. We deployed ESC in the newest version of Botometer, a popular tool to detect social bots in the wild, with a cross-validation AUC of 0.99.
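The idea of combining specialized classifiers through the maximum rule can be illustrated with a minimal Python sketch. It is not the authors' implementation: the bot class names, the feature names (`links_per_tweet`, `friends_to_followers`), the hand-written scoring functions, and the 0.5 threshold are all hypothetical stand-ins for trained classifiers and tuned parameters. The sketch assumes only that each specialist returns a bot score in [0, 1] for one class of bots and that the ensemble reports the highest of those scores.

```python
from typing import Callable, Dict, Mapping, Tuple

Account = Mapping[str, float]          # feature name -> value (hypothetical features)
Scorer = Callable[[Account], float]    # returns a bot score in [0, 1]


def max_rule(account: Account, specialists: Dict[str, Scorer]) -> Tuple[str, float]:
    """Score the account with every specialist and keep the most confident one."""
    if not specialists:
        raise ValueError("at least one specialized classifier is required")
    scores = {name: scorer(account) for name, scorer in specialists.items()}
    best = max(scores, key=lambda name: scores[name])
    return best, scores[best]


# Hypothetical stand-ins for classifiers trained on one bot class each.
def spammer_score(account: Account) -> float:
    return min(1.0, account.get("links_per_tweet", 0.0))


def fake_follower_score(account: Account) -> float:
    return min(1.0, account.get("friends_to_followers", 0.0) / 100.0)


SPECIALISTS: Dict[str, Scorer] = {
    "spammer": spammer_score,
    "fake_follower": fake_follower_score,
}


def is_bot(account: Account, threshold: float = 0.5) -> bool:
    """Flag the account if any specialist is confident enough (assumed threshold)."""
    _, score = max_rule(account, SPECIALISTS)
    return score >= threshold
```

Under these assumptions, an account only has to look like one known class of bot to be flagged, and supporting a newly observed class amounts to adding one more entry to `SPECIALISTS` rather than retraining a single monolithic model.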