Paper title
Privacy Amplification by Subsampling in Time Domain
Paper authors
Paper abstract
Aggregate time-series data such as traffic flow and site occupancy repeatedly sample statistics from a population over time. Such data can be profoundly useful for understanding trends within a given population, but they also pose a significant privacy risk, potentially revealing, for example, who spends time where. Producing a private version of a time series that satisfies the standard definition of Differential Privacy (DP) is challenging because of the large influence a single participant can have on the sequence: if an individual can contribute to every time step, the amount of additive noise needed to satisfy privacy increases linearly with the number of time steps sampled. As such, if a signal spans a long duration or is oversampled, an excessive amount of noise must be added, drowning out underlying trends. However, in many applications an individual realistically cannot participate at every time step. When this is the case, we observe that the influence of a single participant (the sensitivity) can be reduced by subsampling and/or filtering in time, while still meeting privacy requirements. Using a novel analysis, we show this significant reduction in sensitivity and propose a corresponding class of privacy mechanisms. We demonstrate the utility benefits of these techniques empirically with real-world and synthetic time-series data.
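The linear growth in noise described in the abstract can be made concrete with a small sketch. It is not the paper's analysis or mechanism. It uses the textbook Laplace mechanism on a series of counts and assumes that each individual changes any single count by at most 1. The contiguous-window participation model, the fixed-stride subsampling, and all function names (laplace_scale, affected_after_subsampling, privatize) are hypothetical choices made only for illustration.

```python
import math
import random


def laplace_scale(num_steps_affected: int, epsilon: float) -> float:
    """Laplace noise scale b = (L1 sensitivity) / epsilon.

    Assumes one individual changes each affected count by at most 1,
    so the L1 sensitivity equals the number of affected time steps.
    """
    if num_steps_affected < 0 or epsilon <= 0:
        raise ValueError("need num_steps_affected >= 0 and epsilon > 0")
    return num_steps_affected / epsilon


def affected_after_subsampling(window: int, stride: int) -> int:
    """Upper bound on retained samples one individual can affect.

    Illustrative assumption: the individual is present only during a
    contiguous window of `window` time steps, and every `stride`-th step
    is kept. Such a window overlaps at most ceil(window / stride) of them.
    """
    if window < 0 or stride <= 0:
        raise ValueError("need window >= 0 and stride > 0")
    return math.ceil(window / stride)


def privatize(series, sensitivity, epsilon, rng):
    """Add independent Laplace(0, sensitivity / epsilon) noise to each value."""
    b = laplace_scale(sensitivity, epsilon)
    # The difference of two unit-rate exponentials is a standard Laplace variate.
    return [x + b * (rng.expovariate(1.0) - rng.expovariate(1.0)) for x in series]


if __name__ == "__main__":
    total_steps = 1440  # e.g. one count per minute for a day (made-up setting)
    epsilon = 1.0
    window = 60         # assumed longest contiguous presence of one individual
    stride = 30         # keep every 30th sample

    # Worst case: an individual may contribute to every time step.
    print("scale, no participation limit:", laplace_scale(total_steps, epsilon))
    # Participation limited to a window, but every step is released.
    print("scale, window limit only:", laplace_scale(window, epsilon))
    # Participation limited to a window and the series is subsampled.
    k = affected_after_subsampling(window, stride)
    print("scale, window limit + subsampling:", laplace_scale(k, epsilon))

    rng = random.Random(0)
    counts = [100] * (total_steps // stride)  # synthetic subsampled counts
    noisy = privatize(counts, k, epsilon, rng)
    print("first noisy counts:", [round(v, 2) for v in noisy[:3]])
```

Under these assumptions the noise scale falls as the number of time steps a single participant can influence falls, which is the intuition the abstract appeals to. The mechanisms proposed in the paper, including the filtering variant, rest on its own analysis and are not reproduced here.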