Paper Title
Reconstructing Training Data from Model Gradient, Provably
Paper Authors
Paper Abstract
Understanding when and how much a model gradient leaks information about the training sample is an important question in privacy. In this paper, we present a surprising result: even without training or memorizing the data, we can fully reconstruct the training samples from a single gradient query at a randomly chosen parameter value. We prove the identifiability of the training data under mild conditions: with shallow or deep neural networks and a wide range of activation functions. We also present a statistically and computationally efficient algorithm based on tensor decomposition to reconstruct the training data. As a provable attack that reveals sensitive training data, our findings suggest potentially severe threats to privacy, especially in federated learning.
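To make the claim concrete, here is a minimal sketch (not the paper's tensor-decomposition algorithm) of why a single gradient at random parameters can leak a training sample. Assuming a hypothetical two-layer network f(x) = aᵀσ(Wx) with tanh activation and squared loss on one sample, the gradient with respect to W is the rank-one matrix (f(x) − y)·(a ⊙ σ′(Wx)) xᵀ, so the sample x is recoverable (up to scale) from the top right-singular vector of the observed gradient:

```python
import numpy as np

# Hypothetical illustrative setting (two-layer net, tanh, squared loss);
# the paper's actual algorithm uses tensor decomposition and covers
# deeper networks and broader activation families.
rng = np.random.default_rng(0)
d, m = 8, 16                       # input dimension, hidden width
x = rng.standard_normal(d)         # the "training sample" an attacker wants
y = 1.0                            # its label
W = rng.standard_normal((m, d))    # randomly chosen parameters (untrained)
a = rng.standard_normal(m)

z = W @ x
act = np.tanh(z)                   # sigma(Wx)
dact = 1.0 / np.cosh(z) ** 2       # sigma'(Wx)
residual = a @ act - y             # f(x) - y

# Gradient w.r.t. W: every row is a scalar multiple of x, i.e. rank one.
grad_W = residual * (a * dact)[:, None] * x[None, :]

# Recover x (up to scale) as the leading right-singular vector.
_, _, vt = np.linalg.svd(grad_W)
x_hat = vt[0]
x_hat *= np.sign(x_hat @ x)        # resolve the sign ambiguity
print(np.allclose(x_hat, x / np.linalg.norm(x), atol=1e-6))
```

The recovered direction matches the training sample exactly in this toy case; the identifiability results in the paper show that analogous recovery remains possible under much milder conditions.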