论文标题
在当地差异隐私下回答多维范围查询
Answering Multi-Dimensional Range Queries under Local Differential Privacy
论文作者
论文摘要
在本文中,我们解决了在当地差异隐私下回答多维范围查询的问题。有三个关键的技术挑战:捕获属性之间的相关性,避免维度的诅咒,并处理属性的大域。现有的方法都没有令人满意地应对所有三个挑战。克服这三个挑战,我们首先提出了一种称为二维网格(TDG)的方法。它的主要思想是仔细使用binning将所有属性对的二维(2-D)域划分为2-D网格,这些网格可以回答所有2-D范围的查询,然后从相关的2-D范围查询的答案中估算更高维范围查询的答案。但是,为了减少由于噪声而引起的错误,每个属性都需要二维网格中的每个属性粗糙的粒度,从而丢失了单个属性的细粒分布信息。为了纠正这种缺乏症,我们进一步提出了混合维网格(HDG),该网格还引入了1-D网格,以捕获有关每个单个属性分布的细粒度信息,并结合了来自1-D和2-D网格的信息以回答范围查询。为了使HDG持续有效,我们根据分析这些选择如何影响不同的错误来源,为正确选择网格的粒度提供了指南。对真实和合成数据集进行的广泛实验表明,HDG可以对现有方法产生重大改进。
In this paper, we tackle the problem of answering multi-dimensional range queries under local differential privacy. There are three key technical challenges: capturing the correlations among attributes, avoiding the curse of dimensionality, and dealing with the large domains of attributes. None of the existing approaches satisfactorily deals with all three challenges. Overcoming these three challenges, we first propose an approach called Two-Dimensional Grids (TDG). Its main idea is to carefully use binning to partition the two-dimensional (2-D) domains of all attribute pairs into 2-D grids that can answer all 2-D range queries and then estimate the answer of a higher dimensional range query from the answers of the associated 2-D range queries. However, in order to reduce errors due to noises, coarse granularities are needed for each attribute in 2-D grids, losing fine-grained distribution information for individual attributes. To correct this deficiency, we further propose Hybrid-Dimensional Grids (HDG), which also introduces 1-D grids to capture finer-grained information on distribution of each individual attribute and combines information from 1-D and 2-D grids to answer range queries. To make HDG consistently effective, we provide a guideline for properly choosing granularities of grids based on an analysis of how different sources of errors are impacted by these choices. Extensive experiments conducted on real and synthetic datasets show that HDG can give a significant improvement over the existing approaches.