论文标题
本地私人假设选择
Locally Private Hypothesis Selection
论文作者
论文摘要
我们在当地差异隐私下启动了假设选择的研究。给定未知概率分布$ p $和一组$ k $概率分布的样品,我们的目标是在$ \ varepsilon $ - 局部差异隐私的约束下输出,从$ \ \ \ \ \ \ \ \ \ \ \ nathcal {q} $的总变化距离与$ p $相比,是与最佳分配相比。这是对$ k $ - 简单假设测试的经典问题的概括,它与\ nr \ Mathcal {q} $中的$ p \时相对应,我们希望识别$ p $。缺乏隐私限制,此问题需要$ o(\ log k)$从$ p $进行样本,最近显示在(中央)差异隐私下可以实现相同的复杂性。但是,在当地差异隐私下解决此问题的天真方法将需要$ \ tilde o(k^2)$样本。 我们首先表明,局部差异隐私的约束会导致成本呈指数增加:此问题的任何算法都需要至少$ω(k)$样本。其次,对于$ k $的特殊情况,简单的假设测试,我们提供了一种非相互作用算法,几乎与此界限相匹配,需要$ \ tilde o(k)$样本。最后,我们为一般情况提供了顺序的交互式算法,需要$ \ tilde o(k)$ samples,仅$ O(\ log \ log \ log k)$ rounds of交互性。我们的算法是通过使用对抗性比较器减少到最大选择的,这是我们在平行环境中启动研究的独立兴趣问题。对于这个问题,我们为每个允许的交互作用$ t $提供了一个算法系列,以及下界,表明它们每$ t $都几乎是最佳的。值得注意的是,我们的算法会对以前方法的圆形复杂性进行指数改善。
We initiate the study of hypothesis selection under local differential privacy. Given samples from an unknown probability distribution $p$ and a set of $k$ probability distributions $\mathcal{Q}$, we aim to output, under the constraints of $\varepsilon$-local differential privacy, a distribution from $\mathcal{Q}$ whose total variation distance to $p$ is comparable to the best such distribution. This is a generalization of the classic problem of $k$-wise simple hypothesis testing, which corresponds to when $p \in \mathcal{Q}$, and we wish to identify $p$. Absent privacy constraints, this problem requires $O(\log k)$ samples from $p$, and it was recently shown that the same complexity is achievable under (central) differential privacy. However, the naive approach to this problem under local differential privacy would require $\tilde O(k^2)$ samples. We first show that the constraint of local differential privacy incurs an exponential increase in cost: any algorithm for this problem requires at least $Ω(k)$ samples. Second, for the special case of $k$-wise simple hypothesis testing, we provide a non-interactive algorithm which nearly matches this bound, requiring $\tilde O(k)$ samples. Finally, we provide sequentially interactive algorithms for the general case, requiring $\tilde O(k)$ samples and only $O(\log \log k)$ rounds of interactivity. Our algorithms are achieved through a reduction to maximum selection with adversarial comparators, a problem of independent interest for which we initiate study in the parallel setting. For this problem, we provide a family of algorithms for each number of allowed rounds of interaction $t$, as well as lower bounds showing that they are near-optimal for every $t$. Notably, our algorithms result in exponential improvements on the round complexity of previous methods.