论文标题
基于最长的圆形共屈服的局部敏感散列方案
Locality-Sensitive Hashing Scheme based on Longest Circular Co-Substring
论文作者
论文摘要
对局部敏感的哈希(LSH)是$ c $ approximate最近的邻居搜索($ c $ -anns)在高维空间中的最受欢迎的方法之一。在本文中,我们提出了一种基于最长的圆形共串联(LCC)搜索框架(LCCS-LSH)的新型LSH方案,并具有理论保证。我们介绍了一个新颖的LCC概念和一个新的数据结构,称为$ k $ -lccs搜索,名为Cightular Shift Array(CSA)。 LCCS搜索框架的见解是,关闭数据对象的LCC比具有很高概率的遥远的LCC更长。 lccs-lsh是\ emph {lsh-family-Intepentent},它支持具有不同距离指标的$ c $ -anns。我们还引入了LCCS-LSH的多探针版本,并在五个现实生活中进行了广泛的实验。实验结果表明,LCCS-LSH优于最先进的LSH方案。
Locality-Sensitive Hashing (LSH) is one of the most popular methods for $c$-Approximate Nearest Neighbor Search ($c$-ANNS) in high-dimensional spaces. In this paper, we propose a novel LSH scheme based on the Longest Circular Co-Substring (LCCS) search framework (LCCS-LSH) with a theoretical guarantee. We introduce a novel concept of LCCS and a new data structure named Circular Shift Array (CSA) for $k$-LCCS search. The insight of LCCS search framework is that close data objects will have a longer LCCS than the far-apart ones with high probability. LCCS-LSH is \emph{LSH-family-independent}, and it supports $c$-ANNS with different kinds of distance metrics. We also introduce a multi-probe version of LCCS-LSH and conduct extensive experiments over five real-life datasets. The experimental results demonstrate that LCCS-LSH outperforms state-of-the-art LSH schemes.