论文标题

参数化后缀托盘

The Parameterized Suffix Tray

论文作者

Fujisato, Noriki, Nakashima, Yuto, Inenaga, Shunsuke, Bannai, Hideo, Takeda, Masayuki

论文摘要

令$σ$和$π$分别为脱节字母,分别称为静态字母和参数化字母。如果存在$σ$和$σ$的重命名的双匹配$ f $,则两条字符串$ x $和$ y $相等的$ c \杯π$相等的匹配(p匹配),这是$σ$和$π$的$ f $ f $,这是$σ$上的身份,并将$ x $的字符映射到$ y $的$ y $,以便两条串变得相同。在文本中找到给定模式的P匹配事件的问题的索引版本是字符串匹配中的一个良好的主题。在本文中,我们提出了一种用于p匹配的最新索引结构,称为输入文本$ t $的参数化后缀托盘,由$ \ m athsf {pstray}(t)$表示。我们表明,$ \ Mathsf {pstray}(t)$占用$ O(n)$空间,并支持$ O(m + \ log(σ +π) + \ mathit {occ})$ time中的图案匹配的查询,其中$ n $是$ t $ $ p $ p $ p $ p $ t $ t $ t $ t $ t $ t $ t $ t $ t $ t $ t | $σ$是$ t $中的$ |σ| $的不同符号的数量,$ \ mathit {occ} $是$ t $中$ p $的p匹配事件的数量。我们还提出了如何从$ t $的参数化后缀树中构建$ \ mathsf {pstray}(t)$。

Let $Σ$ and $Π$ be disjoint alphabets, respectively called the static alphabet and the parameterized alphabet. Two strings $x$ and $y$ over $Σ\cup Π$ of equal length are said to parameterized match (p-match) if there exists a renaming bijection $f$ on $Σ$ and $Π$ which is identity on $Σ$ and maps the characters of $x$ to those of $y$ so that the two strings become identical. The indexing version of the problem of finding p-matching occurrences of a given pattern in the text is a well-studied topic in string matching. In this paper, we present a state-of-the-art indexing structure for p-matching called the parameterized suffix tray of an input text $T$, denoted by $\mathsf{PSTray}(T)$. We show that $\mathsf{PSTray}(T)$ occupies $O(n)$ space and supports pattern matching queries in $O(m + \log (σ+π) + \mathit{occ})$ time, where $n$ is the length of $T$, $m$ is the length of a query pattern $P$, $π$ is the number of distinct symbols of $|Π|$ in $T$, $σ$ is the number of distinct symbols of $|Σ|$ in $T$ and $\mathit{occ}$ is the number of p-matching occurrences of $P$ in $T$. We also present how to build $\mathsf{PSTray}(T)$ in $O(n)$ time from the parameterized suffix tree of $T$.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源