Paper Title

Data Poisoning Attacks Against Multimodal Encoders

Paper Authors

Ziqing Yang, Xinlei He, Zheng Li, Michael Backes, Mathias Humbert, Pascal Berrang, Yang Zhang

Paper Abstract

Recently, the newly emerged multimodal models, which leverage both visual and linguistic modalities to train powerful encoders, have gained increasing attention. However, learning from a large-scale unlabeled dataset also exposes the model to the risk of potential poisoning attacks, whereby the adversary aims to perturb the model's training data to trigger malicious behaviors in it. In contrast to previous work, which only poisons the visual modality, in this work we take the first step toward studying poisoning attacks against multimodal models in both the visual and linguistic modalities. Specifically, we focus on answering two questions: (1) Is the linguistic modality also vulnerable to poisoning attacks? and (2) Which modality is most vulnerable? To answer these two questions, we propose three types of poisoning attacks against multimodal models. Extensive evaluations on different datasets and model architectures show that all three attacks can achieve significant attack performance while maintaining model utility in both the visual and linguistic modalities. Furthermore, we observe that the poisoning effect differs across modalities. To mitigate the attacks, we propose both pre-training and post-training defenses. We empirically show that both defenses can significantly reduce the attack performance while preserving the model's utility.
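To make the threat model above concrete, here is a minimal, purely illustrative sketch of how a linguistic-modality poison could be injected into an image-caption training set: the adversary rewrites the concept word in a small fraction of captions while leaving the images untouched, so a contrastively trained encoder may learn to associate images of the target concept with an attacker-chosen word. The function name, caption templates, concept words, and poisoning rate are assumptions for illustration only, not the paper's actual attack construction.

```python
# Hypothetical sketch of a linguistic-modality poisoning attack on an
# image-caption dataset (illustrative only; not the paper's exact method).
import random
from dataclasses import dataclass


@dataclass
class Sample:
    image_path: str
    caption: str


def poison_linguistic_modality(dataset, target_word, poison_word,
                               poison_rate=0.01, seed=0):
    """Replace the target concept word with an attacker-chosen word in the
    captions of a small fraction of samples; images are left unchanged."""
    rng = random.Random(seed)
    candidates = [i for i, s in enumerate(dataset) if target_word in s.caption]
    n_poison = max(1, int(len(candidates) * poison_rate))
    for i in rng.sample(candidates, min(n_poison, len(candidates))):
        dataset[i].caption = dataset[i].caption.replace(target_word, poison_word)
    return dataset


# Usage example with hypothetical file names and captions.
data = [
    Sample("img_001.jpg", "a photo of a dog"),
    Sample("img_002.jpg", "a photo of a cat"),
]
poisoned = poison_linguistic_modality(data, target_word="dog",
                                      poison_word="plane", poison_rate=1.0)
print(poisoned[0].caption)  # "a photo of a plane"
```

Because only the text side of the pair is perturbed, such a poison would be invisible to defenses that inspect images alone, which is one intuition for why the linguistic modality is worth studying as an attack surface.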
