论文标题

持续的字符串和段选择中的随机访问

Random Access in Persistent Strings and Segment Selection

论文作者

Bille, Philip, Gørtz, Inge Li

论文摘要

我们考虑支持随机访问查询的类似字符串的集合表示。字符串的集合由根树给出,其中边缘通过编辑操作(插入,删除或替换字符)标记,而节点表示通过将编辑操作的顺序应用于root到节点的路径上获得的字符串。目的是紧凑地表示整个集合,同时支持快速随机访问集合中的字符串的任何部分。这个问题捕获了自然场景,例如代表编辑文档的过去历史或代表高度竞争的收藏。给定$ n $节点的树,我们展示了如何表示$ O(n)$ space和$ o(\ log n/ \ log \ log \ log n)$查询时间的相应集合。这改善了以前的时间空间权衡问题。此外,我们显示了一个下边界,证明查询时间对于使用近线性空间的任何解决方案都是最佳的。 为了达到持续字符串随机访问的界限,我们显示了如何将问题减少到线段上以下自然几何选择问题。考虑平面中的一组水平线段。给定的参数$ i $和$ j $,段选择查询返回$ j $ th的最小细分市场(带有$ j $ th最小$ y $ y $ y $ y $ coortion的细分市场),通过$ x $ coordication $ i $横穿垂直线。段选择问题是将一组水平线段预处理成一个紧凑的数据结构,以支持快速的段选择查询。我们提供了一种解决方案,该解决方案使用$ o(n)$ space和支持段选择查询中的$ o(\ log n/ \ log \ log \ log n)$ time,其中$ n $是段数。此外,我们证明,对于使用近线性空间的任何解决方案,此查询时间也是最佳的。

We consider compact representations of collections of similar strings that support random access queries. The collection of strings is given by a rooted tree where edges are labeled by an edit operation (inserting, deleting, or replacing a character) and a node represents the string obtained by applying the sequence of edit operations on the path from the root to the node. The goal is to compactly represent the entire collection while supporting fast random access to any part of a string in the collection. This problem captures natural scenarios such as representing the past history of an edited document or representing highly-repetitive collections. Given a tree with $n$ nodes, we show how to represent the corresponding collection in $O(n)$ space and $O(\log n/ \log \log n)$ query time. This improves the previous time-space trade-offs for the problem. Additionally, we show a lower bound proving that the query time is optimal for any solution using near-linear space. To achieve our bounds for random access in persistent strings we show how to reduce the problem to the following natural geometric selection problem on line segments. Consider a set of horizontal line segments in the plane. Given parameters $i$ and $j$, a segment selection query returns the $j$th smallest segment (the segment with the $j$th smallest $y$-coordinate) among the segments crossing the vertical line through $x$-coordinate $i$. The segment selection problem is to preprocess a set of horizontal line segments into a compact data structure that supports fast segment selection queries. We present a solution that uses $O(n)$ space and support segment selection queries in $O(\log n/ \log \log n)$ time, where $n$ is the number of segments. Furthermore, we prove that that this query time is also optimal for any solution using near-linear space.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源