Paper Title

Differentially Private Multi-Party Data Release for Linear Regression

Authors

Ruihan Wu, Xin Yang, Yuanshun Yao, Jiankai Sun, Tianyi Liu, Kilian Q. Weinberger, Chong Wang

Abstract

Differentially private (DP) data release is a promising technique for disseminating data without compromising the privacy of data subjects. However, the majority of prior work has focused on scenarios where a single party owns all the data. In this paper, we focus on the multi-party setting, where different stakeholders own disjoint sets of attributes belonging to the same group of data subjects. In the context of linear regression, the goal is to let all parties train models on the complete data without being able to infer private attributes or identities of individuals. We start by directly applying the Gaussian mechanism and show that it suffers from a small eigenvalue problem. We then propose a novel method and prove that it asymptotically converges to the optimal (non-private) solution as the dataset size increases. We substantiate the theoretical results through experiments on both artificial and real-world datasets.
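
The small eigenvalue problem mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it simply assumes that symmetric Gaussian noise is added to the Gram matrix X^T X of a synthetic dataset, with a noise scale `sigma` chosen arbitrarily and not calibrated to any privacy budget. The function name `min_eig_with_noise` and all parameters are hypothetical. When the dataset is small, the noise can dominate and push the smallest eigenvalue toward zero or below, which makes the least-squares solve unstable; as the dataset grows, the Gram matrix grows while the noise stays fixed.

```python
import numpy as np


def min_eig_with_noise(n, d=5, sigma=30.0, seed=0):
    """Return (clean_min_eig, noisy_min_eig, noise_spectral_norm).

    Illustrative only: sigma is an arbitrary noise scale (an assumption),
    not one derived from an (epsilon, delta) privacy guarantee.
    """
    rng = np.random.default_rng(seed)
    # Synthetic features with variance 1/d per entry.
    X = rng.normal(size=(n, d)) / np.sqrt(d)
    gram = X.T @ X  # d x d matrix used by the least-squares solution

    # Symmetric Gaussian perturbation of the Gram matrix.
    raw = rng.normal(scale=sigma, size=(d, d))
    noise = (raw + raw.T) / np.sqrt(2.0)
    noisy_gram = gram + noise

    # eigvalsh returns eigenvalues in ascending order, so [0] is the smallest.
    clean_min = np.linalg.eigvalsh(gram)[0]
    noisy_min = np.linalg.eigvalsh(noisy_gram)[0]
    return clean_min, noisy_min, np.linalg.norm(noise, 2)


for n in (20, 1_000, 100_000):
    clean, noisy, norm = min_eig_with_noise(n)
    print(f"n={n:>7}: clean={clean:10.2f}  noisy={noisy:10.2f}  noise norm={norm:8.2f}")
```

By Weyl's inequality, the smallest eigenvalue can shift by at most the spectral norm of the noise, so the perturbation only becomes negligible once the smallest eigenvalue of the clean Gram matrix is much larger than that norm.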
