论文标题
通过降低维度近似文本对图案距离
Approximating Text-to-Pattern Distance via Dimensionality Reduction
论文作者
论文摘要
文本对图案距离是字符串匹配中的一个基本问题,在串联匹配中,在整数字母上,给定长度$ m $和长度$ n $的文本,我们被要求计算每个位置的模式和文本之间的距离。距离函数可以是例如某些参数$ p> 0 $的锤距离或$ \ ell_p $距离。在过去的$ \ sim 40美元中,几乎所有最新的精确算法和近似算法都使用FFT作为黑盒。在这项工作中,我们介绍$ \ widetilde {o}(n/\ varepsilon^2)$(1 \ pm \ pm \ varepsilon)$ - $ \ ell_2 $ distances的近似值(1 \ pm \ varepsilon)$ - 距离,无需使用FFT。这与Chan等人的最新发展无关。 [STOC 2020],其中提出了$ o(n/\ varepsilon^2)$不使用FFT的锤距离算法 - 尽管它们的算法更为“组合”,但我们的技术适用于其他规范。
Text-to-pattern distance is a fundamental problem in string matching, where given a pattern of length $m$ and a text of length $n$, over an integer alphabet, we are asked to compute the distance between pattern and the text at every location. The distance function can be e.g. Hamming distance or $\ell_p$ distance for some parameter $p > 0$. Almost all state-of-the-art exact and approximate algorithms developed in the past $\sim 40$ years were using FFT as a black-box. In this work we present $\widetilde{O}(n/\varepsilon^2)$ time algorithms for $(1\pm\varepsilon)$-approximation of $\ell_2$ distances, and $\widetilde{O}(n/\varepsilon^3)$ algorithm for approximation of Hamming and $\ell_1$ distances, all without use of FFT. This is independent to the very recent development by Chan et al. [STOC 2020], where $O(n/\varepsilon^2)$ algorithm for Hamming distances not using FFT was presented -- although their algorithm is much more "combinatorial", our techniques apply to other norms than Hamming.