Paper Title

Robust Estimation of Discrete Distributions under Local Differential Privacy

Authors

Julien Chhor, Flore Sentenac

Abstract


Although robust learning and local differential privacy are both widely studied fields of research, combining the two settings is just starting to be explored. We consider the problem of estimating a discrete distribution in total variation from $n$ contaminated data batches under a local differential privacy constraint. A fraction $1-ε$ of the batches contain $k$ i.i.d. samples drawn from a discrete distribution $p$ over $d$ elements. To protect the users' privacy, each of the samples is privatized using an $α$-locally differentially private mechanism. The remaining $εn$ batches are an adversarial contamination. The minimax rate of estimation under contamination alone, with no privacy, is known to be $ε/\sqrt{k}+\sqrt{d/kn}$, up to a $\sqrt{\log(1/ε)}$ factor. Under the privacy constraint alone, the minimax rate of estimation is $\sqrt{d^2/α^2 kn}$. We show that combining the two constraints leads to a minimax estimation rate of $ε\sqrt{d/α^2 k}+\sqrt{d^2/α^2 kn}$ up to a $\sqrt{\log(1/ε)}$ factor, larger than the sum of the two separate rates. We provide a polynomial-time algorithm achieving this bound, as well as a matching information-theoretic lower bound.
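The data model in the abstract can be sketched in a few lines: $n$ batches of $k$ samples each, a $1-ε$ fraction drawn i.i.d. from $p$ and privatized by an $α$-LDP mechanism, and an $εn$ fraction chosen adversarially. The sketch below uses $d$-ary randomized response as the LDP mechanism and a fixed degenerate batch as the contamination; both are illustrative assumptions for exposition, not the paper's construction or estimator.

```python
import math
import random

def randomized_response(x, d, alpha):
    # d-ary (generalized) randomized response, a standard alpha-LDP mechanism:
    # report the true symbol with probability e^alpha / (e^alpha + d - 1),
    # otherwise report a uniformly random *other* symbol.
    p_keep = math.exp(alpha) / (math.exp(alpha) + d - 1)
    if random.random() < p_keep:
        return x
    other = random.randrange(d - 1)
    return other if other < x else other + 1  # skip x itself

def generate_batches(p, n, k, eps, alpha, seed=0):
    # A fraction 1 - eps of the n batches hold k i.i.d. samples from p;
    # every sample (honest or not) is released through the LDP channel.
    # The eps * n contaminated batches put all mass on symbol 0 here --
    # an illustrative stand-in for an arbitrary adversarial choice.
    random.seed(seed)
    d = len(p)
    n_bad = int(eps * n)
    batches = []
    for i in range(n):
        if i < n - n_bad:
            raw = random.choices(range(d), weights=p, k=k)
        else:
            raw = [0] * k  # adversarial batch (illustrative)
        batches.append([randomized_response(x, d, alpha) for x in raw])
    return batches

batches = generate_batches([0.5, 0.3, 0.2], n=100, k=20, eps=0.1, alpha=1.0)
```

Note that the adversary in the paper may choose the contaminated batches arbitrarily, possibly after seeing the honest data; the fixed batch above is only the simplest instance of that threat model.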
