Paper Title
Membership Leakage in Label-Only Exposures
Paper Authors
Paper Abstract
Machine learning (ML) has been widely adopted in various privacy-critical applications, e.g., face recognition and medical image analysis. However, recent research has shown that ML models are vulnerable to attacks against their training data. Membership inference is one major attack in this domain: given a data sample and a model, an adversary aims to determine whether the sample was part of the model's training set. Existing membership inference attacks leverage the confidence scores returned by the model as their inputs (score-based attacks). However, these attacks can be easily mitigated if the model exposes only the predicted label, i.e., the final model decision. In this paper, we propose decision-based membership inference attacks and demonstrate that label-only exposures are also vulnerable to membership leakage. In particular, we develop two types of decision-based attacks, namely the transfer attack and the boundary attack. Empirical evaluation shows that our decision-based attacks achieve remarkable performance, and in some cases even outperform previous score-based attacks. We further present new insights into the success of membership inference based on quantitative and qualitative analysis, namely that member samples of a model lie farther from the model's decision boundary than non-member samples. Finally, we evaluate multiple defense mechanisms against our decision-based attacks and show that both types of attacks can bypass most of these defenses.
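To make the boundary-distance intuition concrete, below is a minimal, self-contained sketch of a label-only membership inference attack in that spirit: it estimates each sample's distance to the decision boundary as the smallest random-noise radius that flips the predicted label, then compares the average distances of training (member) and held-out (non-member) samples. This is an illustrative toy, not the paper's method: the function `estimate_flip_distance` and all parameters here are assumptions, and the paper's boundary attack instead relies on stronger label-only adversarial perturbation techniques to measure this distance.

```python
# Toy sketch (assumed, not the paper's implementation): estimate distance to
# the decision boundary via random-noise label flips, then compare members
# vs. non-members. Members are expected to lie farther from the boundary.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

def estimate_flip_distance(model, x, n_trials=10, max_radius=5.0, n_steps=10, seed=0):
    """Smallest noise radius (over random directions) that changes the
    predicted label; a crude proxy for distance to the decision boundary."""
    rng = np.random.default_rng(seed)
    base_label = model.predict(x.reshape(1, -1))[0]
    for radius in np.linspace(max_radius / n_steps, max_radius, n_steps):
        for _ in range(n_trials):
            direction = rng.normal(size=x.shape)
            direction /= np.linalg.norm(direction)
            if model.predict((x + radius * direction).reshape(1, -1))[0] != base_label:
                return radius  # label flipped: boundary lies within this radius
    return max_radius  # no flip found: boundary appears farther than max_radius

# Train on half the data so the other half serves as non-members.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_member, y_member, X_nonmember = X[:100], y[:100], X[100:]
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_member, y_member)

member_d = np.mean([estimate_flip_distance(model, x) for x in X_member[:20]])
nonmember_d = np.mean([estimate_flip_distance(model, x) for x in X_nonmember[:20]])
print(f"mean boundary distance  members: {member_d:.3f}  non-members: {nonmember_d:.3f}")
# Attack decision rule: predict "member" when the estimated distance exceeds
# a threshold (threshold calibration, e.g., on shadow data, is omitted here).
```

Note that this sketch only queries `model.predict`, never confidence scores, which is exactly the label-only threat model the abstract describes; the random-direction search is a cheap stand-in for the more query-efficient decision-based perturbation methods an actual attack would use.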