论文标题

有界空间在私有分位数上

Bounded Space Differentially Private Quantiles

论文作者

Alabi, Daniel, Ben-Eliezer, Omri, Chaturvedi, Anamay

论文摘要

在流算法文献和差异隐私文献中,估计大数据集的分位数是一个基本问题。但是,所有现有的独立于分布的分位数计算的私人机制都需要在输入尺寸$ n $中至少线性空间。在这项工作中,我们为分位数估计问题设计了一种差异性私有算法,并具有强烈​​的均匀空间复杂性,在单发和持续的观察环境中。我们的基本机制估算了使用$ o \ left的概率$ 1-β$在数据宇宙$ \ Mathcal $ \ Mathcal {x} $上的任何$α$ - approxigation n $流的任何$α$ - $ n $流,使用$ o \ left(\ frac {\ frac {\ log {| \ nathcal {| \ mathcal {x}} |/β|/β) $ε$ -Differential隐私在一个时间点。我们的方法建立在确定性流算法的基础上,用于使用草图项目定义的实用程序函数实例化指数机制,而(私下)从草图定义的间隔进行采样。我们还基于直方图提出了另一种算法,该算法特别适合多个分位数。我们实施算法,并在合成和现实世界数据集上对其进行实验评估。

Estimating the quantiles of a large dataset is a fundamental problem in both the streaming algorithms literature and the differential privacy literature. However, all existing private mechanisms for distribution-independent quantile computation require space at least linear in the input size $n$. In this work, we devise a differentially private algorithm for the quantile estimation problem, with strongly sublinear space complexity, in the one-shot and continual observation settings. Our basic mechanism estimates any $α$-approximate quantile of a length-$n$ stream over a data universe $\mathcal{X}$ with probability $1-β$ using $O\left( \frac{\log (|\mathcal{X}|/β) \log (αεn)}{αε} \right)$ space while satisfying $ε$-differential privacy at a single time point. Our approach builds upon deterministic streaming algorithms for non-private quantile estimation instantiating the exponential mechanism using a utility function defined on sketch items, while (privately) sampling from intervals defined by the sketch. We also present another algorithm based on histograms that is especially suited to the multiple quantiles case. We implement our algorithms and experimentally evaluate them on synthetic and real-world datasets.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源