论文标题
近似痕量重建
Approximate Trace Reconstruction
论文作者
论文摘要
在通常的跟踪重建问题中,目标是在多次通过删除通道后通过独立的删除通道后,精确地重建一条未知的长度字符串,产生一组痕迹(即字符串的随机子序列)。我们考虑了近似重建的放松问题。在这里,目标是输出一个接近原始距离的字符串,而使用的轨迹要比精确重建所需的范围要少得多。我们提出了几种算法,这些算法可以大致重建属于某些类的字符串,其中估计值在$ n/\ mathrm {polylog}(n)$中,并且我们仅使用$ \ mathrm {polylog}(polylog}(n)$ traces(有时只有一个跟踪)。这些类包含需要线性数量的迹线进行精确重建的字符串,并且与典型的随机字符串完全不同。从技术的角度来看,我们的算法大约通过对齐痕迹的密集区域并使用合适的长度来近似每个区域来大致重建未知字符串的连续子字符串。为了补充我们的算法,我们提出了一个一般的黑盒下限,用于近似重建,在下限上建立,以区分最坏情况下的两个候选输入字符串。特别是,这表明,在$ n^{1/3-δ} $内的近似值需要$ n^{1 +3Δ/2}/\ mathrm {polylog}(n)$ traces $ 0 <Δ<1/3 $,在最坏的情况下。
In the usual trace reconstruction problem, the goal is to exactly reconstruct an unknown string of length $n$ after it passes through a deletion channel many times independently, producing a set of traces (i.e., random subsequences of the string). We consider the relaxed problem of approximate reconstruction. Here, the goal is to output a string that is close to the original one in edit distance while using much fewer traces than is needed for exact reconstruction. We present several algorithms that can approximately reconstruct strings that belong to certain classes, where the estimate is within $n/\mathrm{polylog}(n)$ edit distance, and where we only use $\mathrm{polylog}(n)$ traces (or sometimes just a single trace). These classes contain strings that require a linear number of traces for exact reconstruction and which are quite different from a typical random string. From a technical point of view, our algorithms approximately reconstruct consecutive substrings of the unknown string by aligning dense regions of traces and using a run of a suitable length to approximate each region. To complement our algorithms, we present a general black-box lower bound for approximate reconstruction, building on a lower bound for distinguishing between two candidate input strings in the worst case. In particular, this shows that approximating to within $n^{1/3 - δ}$ edit distance requires $n^{1 + 3δ/2}/\mathrm{polylog}(n)$ traces for $0< δ< 1/3$ in the worst case.