Paper Title
Guiding the retraining of convolutional neural networks against adversarial inputs
Paper Authors
Paper Abstract
Background: Deep learning models have many possible vulnerabilities, and among the most worrying are adversarial inputs, which can cause wrong decisions through minor perturbations. It is therefore necessary to retrain these models against adversarial inputs as part of the software testing process that addresses this vulnerability. Furthermore, for energy-efficient testing and retraining, data scientists need support in choosing the best guidance metrics and optimal dataset configurations.
Aims: We examined four guidance metrics for retraining convolutional neural networks, and three retraining configurations. Our goal is to improve the models against adversarial inputs with respect to accuracy, resource utilization, and time, from the point of view of a data scientist, in the context of image classification.
Method: We conducted an empirical study on two image classification datasets. We explored (a) the accuracy, resource utilization, and time of retraining convolutional neural networks when the new training set is ordered by each of four guidance metrics (neuron coverage, likelihood-based surprise adequacy, distance-based surprise adequacy, and random), and (b) the accuracy and resource utilization of retraining convolutional neural networks under three configurations (from scratch with an augmented dataset, from the original weights with an augmented dataset, and from the original weights with only adversarial inputs).
Results: Retraining from the original weights with only adversarial inputs, ordered by a surprise adequacy metric, yields the best model with respect to the metrics used.
Conclusions: Although more studies are necessary, we recommend that data scientists use this configuration and these metrics to address deep learning models' vulnerability to adversarial inputs, as they can improve their models against adversarial inputs without using many inputs.
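As a concrete illustration of the reported best configuration, the sketch below ranks adversarial inputs by a simplified distance-based surprise adequacy (DSA) score and retrains a Keras model from its original weights on only the most surprising inputs. This is a minimal sketch, not the paper's artifact: the model path, the layer name "dense_1", the helper names (activation_traces, dsa_scores), and the arrays x_train/y_train/x_adv/y_adv (with integer class labels) are all assumptions for illustration.

```python
# Minimal sketch (not the paper's artifact): rank adversarial inputs by a
# simplified distance-based surprise adequacy (DSA) score, then retrain
# from the original weights on only the most surprising ones.
import numpy as np
import tensorflow as tf

def activation_traces(model, layer_name, x):
    """Flattened activations of one hidden layer, one trace per input."""
    sub = tf.keras.Model(model.input, model.get_layer(layer_name).output)
    return sub.predict(x, verbose=0).reshape(len(x), -1)

def dsa_scores(train_ats, train_labels, test_ats, test_labels):
    """Simplified DSA (after Kim et al., 2019): distance to the nearest
    same-class training trace, normalized by that trace's distance to the
    nearest other-class training trace."""
    scores = []
    for at, label in zip(test_ats, test_labels):
        same = train_ats[train_labels == label]
        other = train_ats[train_labels != label]
        d_same = np.linalg.norm(same - at, axis=1)
        nearest_same = same[d_same.argmin()]
        d_other = np.linalg.norm(other - nearest_same, axis=1).min()
        scores.append(d_same.min() / (d_other + 1e-12))
    return np.asarray(scores)

# Assumed inputs: a saved Keras model plus numpy arrays x_train, y_train,
# x_adv, y_adv with integer class labels.
model = tf.keras.models.load_model("original_model.h5")   # assumed path
train_ats = activation_traces(model, "dense_1", x_train)  # assumed layer
adv_ats = activation_traces(model, "dense_1", x_adv)
# DSA is computed against the class the model predicts for each input.
adv_pred = model.predict(x_adv, verbose=0).argmax(axis=1)
order = np.argsort(-dsa_scores(train_ats, y_train, adv_ats, adv_pred))
top_k = order[: len(order) // 4]                           # e.g., top 25%
model.fit(x_adv[top_k], y_adv[top_k], epochs=5, batch_size=64)
```

The descending sort front-loads the inputs the model finds most surprising, which is why, per the abstract's results, a small fraction of the adversarial set can suffice for the retraining.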