Query-Efficient and Scalable Black-Box Adversarial Attacks on Discrete Sequential Data via Bayesian Optimization

Paper Title

Query-Efficient and Scalable Black-Box Adversarial Attacks on Discrete Sequential Data via Bayesian Optimization

Authors

Deokjae Lee, Seungyong Moon, Junhyeok Lee, Hyun Oh Song

Abstract

We focus on the problem of adversarial attacks against models on discrete sequential data in the black-box setting, where the attacker aims to craft adversarial examples with limited query access to the victim model. Existing black-box attacks, mostly based on greedy algorithms, find adversarial examples using pre-computed key positions to perturb, which severely limits the search space and might result in suboptimal solutions. To this end, we propose a query-efficient black-box attack using Bayesian optimization, which dynamically computes important positions using an automatic relevance determination (ARD) categorical kernel. We introduce block decomposition and history subsampling techniques to improve the scalability of Bayesian optimization when an input sequence becomes long. Moreover, we develop a post-optimization algorithm that finds adversarial examples with a smaller perturbation size. Experiments on natural language and protein classification tasks demonstrate that our method consistently achieves a higher attack success rate with a significant reduction in query count and modification rate compared to the previous state-of-the-art methods.
