Paper Title
Generative Models with Information-Theoretic Protection Against Membership Inference Attacks
Paper Authors
Paper Abstract
Deep generative models, such as Generative Adversarial Networks (GANs), synthesize diverse high-fidelity data samples by estimating the underlying distribution of high-dimensional data. Despite their success, GANs may disclose private information from the data they are trained on, making them susceptible to adversarial attacks such as membership inference attacks, in which an adversary aims to determine whether a record was part of the training set. We propose an information-theoretically motivated regularization term that prevents the generative model from overfitting to training data and encourages generalizability. We show that this penalty minimizes the Jensen-Shannon divergence between components of the generator trained on data with different membership, and that it can be implemented at low cost using an additional classifier. Our experiments on image datasets demonstrate that with the proposed regularization, which comes at only a small added computational cost, GANs are able to preserve privacy and generate high-quality samples that achieve better downstream classification performance compared to non-private and differentially private generative models.
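The abstract does not spell out the training objective; as a minimal sketch of the idea (the partition count $K$, regularization weight $\lambda$, and auxiliary classifier $C$ below are illustrative assumptions, not notation from the paper), one plausible reading is: split the training data into $K$ subsets, model each with a generator component $G_k$, and penalize disagreement among the components' output distributions,

\[
\min_{G}\max_{D}\; V(D,G) \;+\; \lambda\,\mathrm{JS}\!\left(p_{G_1},\ldots,p_{G_K}\right),
\qquad
\mathrm{JS}\!\left(p_{G_1},\ldots,p_{G_K}\right)
= \log K + \max_{C}\;\frac{1}{K}\sum_{k=1}^{K}\mathbb{E}_{x\sim p_{G_k}}\!\left[\log C_k(x)\right],
\]

where $C_k(x)$ denotes the classifier's predicted probability that sample $x$ was produced by component $G_k$. The second identity is the standard variational characterization of the multi-distribution Jensen-Shannon divergence: a $K$-way classifier trained to predict which component generated a sample provides the divergence estimate, in the same way the discriminator of a vanilla GAN estimates the JS divergence between the real and generated distributions, which is consistent with the abstract's claim that the penalty can be implemented at low cost with one additional classifier.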