论文标题
Mead:一种评估对抗性示例检测器的多武器方法
MEAD: A Multi-Armed Approach for Evaluation of Adversarial Examples Detectors
论文作者
论文摘要
对抗性示例的检测在过去几年中一直是一个热门话题,因为它对于在关键应用程序中安全部署机器学习算法的重要性。但是,通常通过假设一种隐式已知的攻击策略来验证检测方法,这不一定要考虑现实生活中的威胁。确实,这可能导致对检测器性能的过度评估,并可能在竞争检测方案之间的比较中引起一些偏见。我们提出了一个新型的多武器框架,称为Mead,用于根据几种攻击策略来评估探测器,以克服这一限制。其中,我们利用三个新目标来产生攻击。提出的性能指标基于最坏的情况:当正确识别所有不同的攻击时,检测是成功的。从经验上讲,我们表明了方法的有效性。此外,最先进的探测器获得的性能不佳,开辟了一项新的令人兴奋的研究。
Detection of adversarial examples has been a hot topic in the last years due to its importance for safely deploying machine learning algorithms in critical applications. However, the detection methods are generally validated by assuming a single implicitly known attack strategy, which does not necessarily account for real-life threats. Indeed, this can lead to an overoptimistic assessment of the detectors' performance and may induce some bias in the comparison between competing detection schemes. We propose a novel multi-armed framework, called MEAD, for evaluating detectors based on several attack strategies to overcome this limitation. Among them, we make use of three new objectives to generate attacks. The proposed performance metric is based on the worst-case scenario: detection is successful if and only if all different attacks are correctly recognized. Empirically, we show the effectiveness of our approach. Moreover, the poor performance obtained for state-of-the-art detectors opens a new exciting line of research.