Paper title
Data Augmentation MCMC for Bayesian Inference from Privatized Data
Paper authors
Abstract
Differentially private mechanisms protect privacy by introducing additional randomness into the data. Restricting access to only the privatized data makes it challenging to perform valid statistical inference on parameters underlying the confidential data. Specifically, the likelihood function of the privatized data requires integrating over the large space of confidential databases and is typically intractable. For Bayesian analysis, this results in a posterior distribution that is doubly intractable, rendering traditional MCMC techniques inapplicable. We propose an MCMC framework to perform Bayesian inference from the privatized data, which is applicable to a wide range of statistical models and privacy mechanisms. Our MCMC algorithm augments the model parameters with the unobserved confidential data, and alternately updates each one conditional on the other. For the potentially challenging step of updating the confidential data, we propose a generic approach that exploits the privacy guarantee of the mechanism to ensure efficiency. We give results on the computational complexity, acceptance rate, and mixing properties of our MCMC. We illustrate the efficacy and applicability of our methods on a naïve-Bayes log-linear model as well as on a linear regression model.
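The alternating scheme described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption chosen for illustration and is not taken from the paper: the toy model (n confidential bits drawn from Bernoulli(theta) with a uniform Beta(1, 1) prior), the mechanism (the sum of the bits released with Laplace noise of scale 1/epsilon), and the names `da_mcmc` and `laplace_logpdf`. The confidential-data update proposes each record from the model and accepts it using only the ratio of mechanism densities, which is one plausible reading of "exploits the privacy guarantee of the mechanism".

```python
import math
import random


def laplace_logpdf(value, loc, scale):
    """Log density of a Laplace(loc, scale) distribution at `value`."""
    return -abs(value - loc) / scale - math.log(2.0 * scale)


def da_mcmc(s_dp, n, epsilon, n_iter=2000, seed=0):
    """Toy data-augmentation sampler (hypothetical helper, not the paper's code).

    Assumed model: x_i ~ Bernoulli(theta), theta ~ Beta(1, 1).
    Assumed mechanism: s_dp = sum(x) + Laplace(0, 1/epsilon).
    Returns the sampled theta values and the acceptance rate of the x-updates.
    """
    rng = random.Random(seed)
    scale = 1.0 / epsilon  # the sum of n bits changes by at most 1 per record
    theta = 0.5
    x = [1 if rng.random() < theta else 0 for _ in range(n)]
    s = sum(x)
    thetas, accepted = [], 0
    for _ in range(n_iter):
        # Step 1: update theta given the imputed confidential data
        # (conjugate Beta posterior for Bernoulli data).
        theta = rng.betavariate(1 + s, 1 + n - s)
        # Step 2: update each confidential record given theta and s_dp.
        for i in range(n):
            proposal = 1 if rng.random() < theta else 0  # draw from the model
            s_new = s - x[i] + proposal
            # The model terms cancel, so only the mechanism density ratio remains.
            log_ratio = (laplace_logpdf(s_dp, s_new, scale)
                         - laplace_logpdf(s_dp, s, scale))
            if rng.random() < math.exp(min(0.0, log_ratio)):
                x[i], s = proposal, s_new
                accepted += 1
        thetas.append(theta)
    return thetas, accepted / (n_iter * n)
```

In this toy setting a single-record proposal moves the sum by at most 1, so the log density ratio lies in [-epsilon, epsilon] and each proposal is accepted with probability at least exp(-epsilon). A call such as `da_mcmc(s_dp=30.4, n=100, epsilon=1.0)` returns the theta draws, which can be averaged after discarding an initial burn-in portion.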