Paper Title

Robust Estimation of Discrete Distributions under Local Differential Privacy

Authors

Julien Chhor, Flore Sentenac

Abstract


Although robust learning and local differential privacy are both widely studied fields of research, combining the two settings is just starting to be explored. We consider the problem of estimating a discrete distribution in total variation from $n$ contaminated data batches under a local differential privacy constraint. A fraction $1-ε$ of the batches contain $k$ i.i.d. samples drawn from a discrete distribution $p$ over $d$ elements. To protect the users' privacy, each of the samples is privatized using an $α$-locally differentially private mechanism. The remaining $εn$ batches are an adversarial contamination. The minimax rate of estimation under contamination alone, with no privacy, is known to be $ε/\sqrt{k}+\sqrt{d/kn}$, up to a $\sqrt{\log(1/ε)}$ factor. Under the privacy constraint alone, the minimax rate of estimation is $\sqrt{d^2/α^2 kn}$. We show that combining the two constraints leads to a minimax estimation rate of $ε\sqrt{d/α^2 k}+\sqrt{d^2/α^2 kn}$ up to a $\sqrt{\log(1/ε)}$ factor, larger than the sum of the two separate rates. We provide a polynomial-time algorithm achieving this bound, as well as a matching information-theoretic lower bound.
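The data model in the abstract can be sketched in a few lines: $n$ batches of $k$ samples each, a $1-ε$ fraction drawn i.i.d. from $p$ and privatized by an $α$-LDP mechanism, and an $εn$ fraction chosen adversarially. The sketch below uses $d$-ary randomized response as the LDP mechanism and a fixed degenerate batch as the contamination; both are illustrative assumptions for exposition, not the paper's construction or estimator.

```python
import math
import random

def randomized_response(x, d, alpha):
    # d-ary (generalized) randomized response, a standard alpha-LDP mechanism:
    # report the true symbol with probability e^alpha / (e^alpha + d - 1),
    # otherwise report a uniformly random *other* symbol.
    p_keep = math.exp(alpha) / (math.exp(alpha) + d - 1)
    if random.random() < p_keep:
        return x
    other = random.randrange(d - 1)
    return other if other < x else other + 1  # skip x itself

def generate_batches(p, n, k, eps, alpha, seed=0):
    # A fraction 1 - eps of the n batches hold k i.i.d. samples from p;
    # every sample (honest or not) is released through the LDP channel.
    # The eps * n contaminated batches put all mass on symbol 0 here --
    # an illustrative stand-in for an arbitrary adversarial choice.
    random.seed(seed)
    d = len(p)
    n_bad = int(eps * n)
    batches = []
    for i in range(n):
        if i < n - n_bad:
            raw = random.choices(range(d), weights=p, k=k)
        else:
            raw = [0] * k  # adversarial batch (illustrative)
        batches.append([randomized_response(x, d, alpha) for x in raw])
    return batches

batches = generate_batches([0.5, 0.3, 0.2], n=100, k=20, eps=0.1, alpha=1.0)
```

Note that the adversary in the paper may choose the contaminated batches arbitrarily, possibly after seeing the honest data; the fixed batch above is only the simplest instance of that threat model.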
