Paper Title
Continuous Release of Data Streams under both Centralized and Local Differential Privacy
Paper Authors
Paper Abstract
In this paper, we study the problem of publishing a stream of real-valued data while satisfying differential privacy (DP). One major challenge is that the maximal possible value can be quite large; it is therefore necessary to estimate a threshold above which values are truncated, which reduces the amount of noise that must be added to all the data. The estimation must be based on the data and carried out in a private fashion. We develop such a method using the Exponential Mechanism with a quality function that closely approximates the utility goal while maintaining low sensitivity. Given the threshold, we then propose a novel online hierarchical method and several post-processing techniques. Building on these ideas, we formalize the steps into a framework for private publishing of stream data. Our framework consists of three components: a threshold optimizer that privately estimates the threshold, a perturber that adds calibrated noise to the stream, and a smoother that improves the result using post-processing. Within our framework, we also design an algorithm satisfying the more stringent setting of DP called local DP (LDP). To our knowledge, this is the first LDP algorithm for publishing streaming data. Using four real-world datasets, we demonstrate that our mechanism outperforms the state of the art by 6 to 10 orders of magnitude in terms of utility (measured by the mean squared error of answering a random range query).
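As a rough illustration of how truncation limits the noise, the Python sketch below clips each value at a threshold, adds Laplace noise scaled to that threshold, and smooths the noisy output. It is a simplified stand-in, not the paper's method: it perturbs each value independently instead of using the online hierarchical method, it takes the threshold `theta` as given instead of estimating it with the Exponential Mechanism, and it assumes nonnegative values with each stream element contributed by one individual. All function and parameter names (`truncate`, `perturb`, `smooth`, `theta`, `epsilon`, `window`) are hypothetical.

```python
import random

def truncate(stream, theta):
    """Clip each value into [0, theta], so changing one value moves it by at most theta."""
    return [min(max(v, 0.0), theta) for v in stream]

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def perturb(stream, theta, epsilon, rng):
    """Add Laplace noise of scale theta / epsilon to each truncated value.

    Assumption: each element belongs to a distinct individual, so the
    per-element sensitivity after truncation is theta.
    """
    scale = theta / epsilon
    return [v + laplace_noise(scale, rng) for v in truncate(stream, theta)]

def smooth(noisy, window):
    """Trailing moving average; it reads only noisy outputs, so it is pure post-processing."""
    out = []
    for i in range(len(noisy)):
        chunk = noisy[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    stream = [3.0, 7.5, 120.0, 4.2, 9.9]  # one outlier far above the rest
    released = smooth(perturb(stream, theta=10.0, epsilon=1.0, rng=rng), window=3)
    print(released)
```

With `theta=10.0` the outlier 120.0 is clipped to 10.0, and the noise scale is 10 / epsilon instead of 120 / epsilon. The price is a bias on truncated values, which is why the choice of threshold matters.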