Paper Title
Defending Distributed Classifiers Against Data Poisoning Attacks
Paper Authors
Paper Abstract
Support Vector Machines (SVMs) are vulnerable to targeted training data manipulations such as poisoning attacks and label flips. By carefully manipulating a subset of training samples, the attacker forces the learner to compute an incorrect decision boundary, thereby causing misclassifications. Considering the increased importance of SVMs in engineering and life-critical applications, we develop a novel defense algorithm that improves resistance against such attacks. Local Intrinsic Dimensionality (LID) is a promising metric that characterizes the outlierness of data samples. In this work, we introduce a new approximation of LID called K-LID that uses kernel distance in the LID calculation, which allows LID to be computed in high-dimensional transformed spaces. We introduce a weighted SVM that uses K-LID as a distinguishing characteristic to de-emphasize the effect of suspicious data samples on the SVM decision boundary. Each sample is weighted by how likely its K-LID value is to come from the benign K-LID distribution rather than the attacked K-LID distribution. We then demonstrate how the proposed defense can be applied to a distributed SVM framework through a case study on an SDR-based surveillance system. Experiments with benchmark datasets show that the proposed defense reduces classification error rates substantially (by 10% on average).
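A minimal sketch of the scheme the abstract describes, assuming the standard maximum-likelihood LID estimator, an RBF kernel, and Gaussian kernel-density fits for the benign and attacked K-LID distributions. All of these are illustrative choices (as are the helper names rbf_kernel, k_lid, and lid_weights); the paper's exact estimator, kernel, and distribution fits may differ:

    import numpy as np
    from scipy.stats import gaussian_kde
    from sklearn.svm import SVC

    def rbf_kernel(X, gamma=0.1):
        """Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
        sq = (X ** 2).sum(1)
        d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
        return np.exp(-gamma * np.maximum(d2, 0.0))

    def k_lid(X, k=20, gamma=0.1):
        """K-LID per sample: the MLE estimator of LID,
        -(1/k * sum_i ln(r_i / r_k))^{-1}, evaluated on the kernel-induced
        distance d(x, y) = sqrt(K(x,x) - 2 K(x,y) + K(y,y)), so that
        neighborhoods are measured in the transformed feature space."""
        K = rbf_kernel(X, gamma)
        diag = np.diag(K)
        D = np.sqrt(np.maximum(diag[:, None] - 2.0 * K + diag[None, :], 0.0))
        np.fill_diagonal(D, np.inf)             # exclude self-distance
        r = np.sort(D, axis=1)[:, :k]           # k nearest-neighbor distances
        ratios = np.maximum(r / r[:, -1:], 1e-12)
        return -1.0 / np.log(ratios).mean(axis=1)

    def lid_weights(lids, benign_lids, attacked_lids, eps=1e-12):
        """Weight each training sample by how likely its K-LID value is
        to come from the benign distribution rather than the attacked one
        (hypothetical: densities fitted here with Gaussian KDE)."""
        p_benign = gaussian_kde(benign_lids)(lids)
        p_attack = gaussian_kde(attacked_lids)(lids)
        return p_benign / (p_benign + p_attack + eps)

    # Usage: down-weight suspicious samples when fitting the SVM.
    # benign_lids / attacked_lids would come from reference data; the
    # slices below are placeholders for that calibration step.
    X = np.random.randn(200, 10)
    y = np.random.randint(0, 2, 200)
    lids = k_lid(X)
    w = lid_weights(lids, benign_lids=lids[:100], attacked_lids=lids[100:])
    SVC(kernel="rbf", gamma=0.1).fit(X, y, sample_weight=w)

Samples whose K-LID values look more typical of attacked data receive weights near zero, so they contribute little to the decision boundary, while benign-looking samples keep weights near one.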