Paper Title

When Machine Unlearning Jeopardizes Privacy

Authors

Min Chen, Zhikun Zhang, Tianhao Wang, Michael Backes, Mathias Humbert, Yang Zhang

Abstract

The right to be forgotten states that a data owner has the right to erase their data from an entity storing it. In the context of machine learning (ML), the right to be forgotten requires an ML model owner to remove the data owner's data from the training set used to build the ML model, a process known as machine unlearning. While originally designed to protect the privacy of the data owner, we argue that machine unlearning may leave some imprint of the data in the ML model and thus create unintended privacy risks. In this paper, we perform the first study on investigating the unintended information leakage caused by machine unlearning. We propose a novel membership inference attack that leverages the different outputs of an ML model's two versions to infer whether a target sample is part of the training set of the original model but out of the training set of the corresponding unlearned model. Our experiments demonstrate that the proposed membership inference attack achieves strong performance. More importantly, we show that our attack in multiple cases outperforms the classical membership inference attack on the original ML model, which indicates that machine unlearning can have counterproductive effects on privacy. We notice that the privacy degradation is especially significant for well-generalized ML models where classical membership inference does not perform well. We further investigate four mechanisms to mitigate the newly discovered privacy risks and show that releasing the predicted label only, temperature scaling, and differential privacy are effective. We believe that our results can help improve privacy protection in practical implementations of machine unlearning. Our code is available at https://github.com/MinChen00/UnlearningLeaks.
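The core idea of the attack can be illustrated with a minimal sketch: the adversary queries both the original model and the unlearned model on a target sample, builds a feature vector from the two posterior outputs, and feeds it to a binary attack classifier. The sketch below is illustrative only and is not the authors' exact pipeline; the synthetic posteriors stand in for shadow-model outputs, and the feature construction (concatenation plus element-wise difference) and the logistic-regression attack model are assumptions made for brevity.

```python
# Minimal sketch of a membership inference attack against machine unlearning.
# Assumption: synthetic posteriors emulate shadow-model outputs; in the paper,
# the attack model is trained on shadow original/unlearned model pairs.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
num_classes = 10
n = 1000  # samples per class of attack training data (hypothetical)

def softmax(logits):
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attack_features(post_original, post_unlearned):
    """Combine the two posterior vectors: concatenation + element-wise difference."""
    return np.concatenate(
        [post_original, post_unlearned, post_original - post_unlearned], axis=-1
    )

# --- Synthetic stand-in for shadow-model posteriors (illustrative assumption) ---
# "Deleted" samples: the original model is confident on them, the unlearned model is not.
labels = rng.integers(num_classes, size=n)
deleted_orig = softmax(rng.normal(0, 1, (n, num_classes)) + 4.0 * np.eye(num_classes)[labels])
deleted_unl = softmax(rng.normal(0, 1, (n, num_classes)))

# "Non-member" samples: both model versions produce nearly identical posteriors.
nonmem_logits = rng.normal(0, 1, (n, num_classes))
nonmem_orig = softmax(nonmem_logits)
nonmem_unl = softmax(nonmem_logits + rng.normal(0, 0.1, (n, num_classes)))

# Train the binary attack model: 1 = target was in the original training set and unlearned.
X = np.vstack([attack_features(deleted_orig, deleted_unl),
               attack_features(nonmem_orig, nonmem_unl)])
y = np.concatenate([np.ones(n), np.zeros(n)])

attack_model = LogisticRegression(max_iter=1000).fit(X, y)
print("attack accuracy on synthetic attack-training features:", attack_model.score(X, y))
```

The mitigations studied in the paper target exactly this signal: releasing only the predicted label or applying temperature scaling reduces the information carried by the posterior pair, and differential privacy bounds the influence of any single sample on both model versions.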
