Paper Title
Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries
Paper Authors
Paper Abstract
Machine learning models have been shown to leak information that violates the privacy of their training set. We focus on membership inference attacks on machine learning models, which aim to determine whether a data point was used to train the victim model. Our work has two sides: We introduce the sampling attack, a novel membership inference technique that, unlike other standard membership adversaries, works under the severe restriction of having no access to the victim model's scores. We show that a victim model that publishes only labels is still susceptible to sampling attacks, and the adversary can recover up to 100% of its performance compared to the case where posterior vectors are provided. The other side of our work presents experimental results on two recent membership inference attack models and the defenses against them. For defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model, as well as output perturbation at prediction time. We carry out our experiments on a wide range of datasets, which allows us to better analyze the interaction between adversaries, defense mechanisms, and datasets. We find that our proposed fast and easy-to-implement output perturbation technique offers good privacy protection against membership inference attacks with little impact on utility.
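To make the two techniques mentioned in the abstract concrete, the sketch below illustrates (under stated assumptions, not as the authors' exact procedure) the sampling-attack idea of repeatedly querying a label-only victim model on perturbed copies of a point and using the empirical label frequencies as a surrogate posterior, plus a simple output-perturbation defense applied to posteriors at prediction time. The function names `victim_predict_label`, the Gaussian perturbation, and the Laplace noise scale are all illustrative assumptions.

```python
import numpy as np

def sampling_attack_posterior(victim_predict_label, x, num_classes,
                              n_queries=100, noise_scale=0.1, rng=None):
    """Approximate a posterior vector for x by repeatedly querying a
    label-only victim model on noisy copies of x and counting the
    returned labels. (Hypothetical sketch, not the paper's exact method.)"""
    rng = np.random.default_rng() if rng is None else rng
    counts = np.zeros(num_classes)
    for _ in range(n_queries):
        # Perturb the query point; the victim only reveals a hard label.
        x_perturbed = x + rng.normal(scale=noise_scale, size=x.shape)
        label = victim_predict_label(x_perturbed)
        counts[label] += 1
    # Empirical label frequencies stand in for the withheld posterior.
    return counts / n_queries


def output_perturbation(posterior, epsilon=1.0, rng=None):
    """Defense sketch: add noise to the posterior at prediction time and
    re-normalize. Noise type and scale are illustrative assumptions."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = posterior + rng.laplace(scale=1.0 / epsilon, size=posterior.shape)
    noisy = np.clip(noisy, 1e-12, None)
    return noisy / noisy.sum()
```

A standard score-based membership inference attack could then be run unchanged on the surrogate posteriors returned by `sampling_attack_posterior`, while the defense would be applied by the model owner before releasing any prediction scores.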