Paper Title
A Hybrid Defense Method against Adversarial Attacks on Traffic Sign Classifiers in Autonomous Vehicles
Authors
Abstract
Adversarial attacks can make deep neural network (DNN) models predict incorrect output labels, such as misclassified traffic signs, for autonomous vehicle (AV) perception modules. Resilience against adversarial attacks can help AVs navigate safely on the road by avoiding misclassification of signs or objects. This DNN-based study develops a resilient traffic sign classifier for AVs that uses a hybrid defense method. We use transfer learning to retrain the Inception-V3 and ResNet-152 models as traffic sign classifiers. The method also combines three different strategies: random filtering, ensembling, and local feature mapping. We use random cropping and resizing for random filtering, plurality voting as the ensembling strategy, and an optical character recognition model as the local feature mapper. This DNN-based hybrid defense method has been tested in the no-attack scenario and against well-known untargeted adversarial attacks (e.g., Projected Gradient Descent or PGD, Fast Gradient Sign Method or FGSM, Momentum Iterative Method or MIM, and Carlini and Wagner or C&W). We find that our hybrid defense method achieves 99% average traffic sign classification accuracy in the no-attack scenario and 88% average traffic sign classification accuracy across all attack scenarios. Moreover, compared to traditional defense methods (i.e., JPEG filtering, feature squeezing, binary filtering, and random filtering), the hybrid defense method presented in this study improves traffic sign classification accuracy by up to 6%, 50%, and 55% for FGSM, MIM, and PGD attacks, respectively.
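To make two of the strategies named in the abstract concrete, the following is a minimal sketch of random cropping and resizing followed by plurality voting over several classifiers. It is an illustration under stated assumptions, not the authors' implementation: the function names (`random_crop_resize`, `plurality_vote`, `hybrid_predict`), the crop fraction, the number of random views, and the nearest-neighbor resize are all hypothetical choices, and images are plain nested lists so that the sketch has no dependencies. The OCR-based local feature mapper is left out because the abstract does not say how its output is combined with the votes.

```python
import random
from collections import Counter


def random_crop_resize(image, crop_frac=0.9, rng=random):
    """Crop a random window and resize it back to the original size.

    `image` is a 2D list of pixel values. `crop_frac` (an assumed
    parameter) is the fraction of each side kept by the crop.
    """
    h, w = len(image), len(image[0])
    ch, cw = max(1, int(h * crop_frac)), max(1, int(w * crop_frac))
    top = rng.randint(0, h - ch)    # inclusive bounds
    left = rng.randint(0, w - cw)
    crop = [row[left:left + cw] for row in image[top:top + ch]]
    # Nearest-neighbor resize of the crop back to (h, w).
    return [[crop[r * ch // h][c * cw // w] for c in range(w)]
            for r in range(h)]


def plurality_vote(labels):
    """Return the label that received the most votes."""
    return Counter(labels).most_common(1)[0][0]


def hybrid_predict(image, classifiers, n_views=5, seed=0):
    """Classify several randomly cropped views with every classifier
    and return the plurality label over all votes."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_views):
        view = random_crop_resize(image, rng=rng)
        for clf in classifiers:
            votes.append(clf(view))
    return plurality_vote(votes)
```

Here each element of `classifiers` stands in for a retrained model such as Inception-V3 or ResNet-152, wrapped as a callable that maps an image to a label. The random crop changes the input on every call, so a perturbation tuned to one fixed pixel grid may not carry over to the view the classifier actually sees.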