论文标题
语义私人信息检索
Semantic Private Information Retrieval
论文作者
论文摘要
我们研究语义私人信息检索的问题(语义PIR)。在语义PIR中,用户在$ n $复制的$ n $中存储的$ k $独立消息的消息检索,而无需向任何单个数据库揭示所需消息的身份。这些消息带有\ emph {不同的语义},即,允许这些消息具有\ emph {non-Cristion a a先生概率},由$(p_i> 0,\:i \:i \ in [k])$表示,它们对它们各自的检索均具有proxy fortival of torky purrive fortival fortival of fortival of fortival of fortival of fortival of fortival of fortival of fortival of fortival of fortival of fortive fortival of usph and \ emph emph sys says says of( [k])$。这是对经典私人信息检索(PIR)问题的概括,其中假定消息具有相等的先验概率和相等的消息大小。我们得出了一般$ k $,$ n $的语义PIR能力。结果表明,语义PIR容量取决于数据库$ n $的数量,消息$ k $的数量,消息的先验概率分布$ p_i $以及消息大小$ l_i $。我们提出了两个可实现的语义PIR方案:第一个是基于信息不对称的确定性方案。该方案采用不均匀的子包装。第二个方案是概率的,基于随机选择一个从多个选项设置的查询来检索所需的消息,而无需指数子包装。我们为语义PIR的能力提供了必要和足够的条件,以超过相等的先验和大小的经典PIR能力。我们的结果表明,当较长的消息具有更高的流行时,语义PIR的容量可能大于经典的PIR容量。但是,当消息相等时,不能利用非均匀的先验来提高检索速率,而不是经典的PIR容量。
We investigate the problem of semantic private information retrieval (semantic PIR). In semantic PIR, a user retrieves a message out of $K$ independent messages stored in $N$ replicated and non-colluding databases without revealing the identity of the desired message to any individual database. The messages come with \emph{different semantics}, i.e., the messages are allowed to have \emph{non-uniform a priori probabilities} denoted by $(p_i>0,\: i \in [K])$, which are a proxy for their respective popularity of retrieval, and \emph{arbitrary message sizes} $(L_i,\: i \in [K])$. This is a generalization of the classical private information retrieval (PIR) problem, where messages are assumed to have equal a priori probabilities and equal message sizes. We derive the semantic PIR capacity for general $K$, $N$. The results show that the semantic PIR capacity depends on the number of databases $N$, the number of messages $K$, the a priori probability distribution of messages $p_i$, and the message sizes $L_i$. We present two achievable semantic PIR schemes: The first one is a deterministic scheme which is based on message asymmetry. This scheme employs non-uniform subpacketization. The second scheme is probabilistic and is based on choosing one query set out of multiple options at random to retrieve the required message without the need for exponential subpacketization. We derive necessary and sufficient conditions for the semantic PIR capacity to exceed the classical PIR capacity with equal priors and sizes. Our results show that the semantic PIR capacity can be larger than the classical PIR capacity when longer messages have higher popularities. However, when messages are equal-length, the non-uniform priors cannot be exploited to improve the retrieval rate over the classical PIR capacity.