Paper Title

No Free Lunch in "Privacy for Free: How does Dataset Condensation Help Privacy"

Paper Authors

Nicholas Carlini, Vitaly Feldman, Milad Nasr

Paper Abstract

New methods designed to preserve data privacy require careful scrutiny. Failure to preserve privacy is hard to detect, and yet can lead to catastrophic results when a system implementing a "privacy-preserving" method is attacked. A recent work selected for an Outstanding Paper Award at ICML 2022 (Dong et al., 2022) claims that dataset condensation (DC) significantly improves data privacy when training machine learning models. This claim is supported by theoretical analysis of a specific dataset condensation technique and an empirical evaluation of resistance to some existing membership inference attacks. In this note we examine the claims in the work of Dong et al. (2022) and describe major flaws in the empirical evaluation of the method and its theoretical analysis. These flaws imply that their work does not provide statistically significant evidence that DC improves the privacy of training ML models over a naive baseline. Moreover, previously published results show that DP-SGD, the standard approach to privacy-preserving ML, simultaneously gives better accuracy and achieves a (provably) lower membership attack success rate.
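For context on the kind of attack being measured: the simplest membership inference attack thresholds a model's per-example loss, since examples seen during training tend to have lower loss than held-out examples. Below is a minimal, self-contained sketch of that baseline attack in Python; the function name and the synthetic loss values are our own illustration, not code or data from the paper.

import numpy as np

def loss_threshold_mia(member_losses, nonmember_losses, threshold):
    # Predict "member" when loss < threshold, "non-member" otherwise,
    # and report the attack's balanced accuracy.
    tpr = np.mean(member_losses < threshold)      # members correctly flagged
    tnr = np.mean(nonmember_losses >= threshold)  # non-members correctly passed
    return (tpr + tnr) / 2

# Hypothetical loss distributions, for demonstration only:
# training-set members typically have lower loss than non-members.
rng = np.random.default_rng(0)
member_losses = rng.exponential(scale=0.5, size=1000)
nonmember_losses = rng.exponential(scale=1.5, size=1000)
print(f"attack accuracy: {loss_threshold_mia(member_losses, nonmember_losses, 0.7):.3f}")

A random-guessing attacker scores 0.5 on this metric; a genuinely privacy-preserving training method should keep the best achievable attack accuracy close to that baseline.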
