论文标题

编码基于聚合物的数据存储

Coding for Polymer-Based Data Storage

论文作者

Pattabiraman, Srilakshmi, Gabrys, Ryan, Milenkovic, Olgica

论文摘要

由基于聚合物的数据存储平台使用,这些平台使用二进制合成聚合物的链条作为记录介质,并通过串联质谱仪读取内容,我们提出了一个新的代码系列,既可以允许唯一的字符串重建和校正多个质量误差。我们考虑两种方法:第一种方法与非对称错误有关,它基于引入冗余,该冗余与误差数量缩放,并与字符串的长度进行对数。该构造允许字符串仅基于其错误的子字符串组成多键进行唯一重建。我们独特的重建方法背后的关键思想是用任意二进制字符串交织(移动的)加泰罗尼亚 - bertrand路径,并“反射”它们,以迫使前缀和相同长度的后缀以具有不同的权重。该方案的渐近代码速率是一种,解码是通过用于收费公路问题的回溯算法的简化版本来完成的。对于对称误差,我们使用质量信息的多项式表征,并为此设置调整多项式评估代码构建。在此过程中,我们为恒定数量的组成误差开发了新的有效解码算法,并表明该方案的冗余与误差数量四倍地缩放,并与代码增长进行对数。

Motivated by polymer-based data-storage platforms that use chains of binary synthetic polymers as the recording media and read the content via tandem mass spectrometers, we propose a new family of codes that allows for both unique string reconstruction and correction of multiple mass errors. We consider two approaches: The first approach pertains to asymmetric errors and it is based on introducing redundancy that scales linearly with the number of errors and logarithmically with the length of the string. The construction allows for the string to be uniquely reconstructed based only on its erroneous substring composition multiset. The key idea behind our unique reconstruction approach is to interleave (shifted) Catalan-Bertrand paths with arbitrary binary strings and "reflect" them so as to force prefixes and suffixes of the same length to have different weights. The asymptotic code rate of the scheme is one, and decoding is accomplished via a simplified version of the backtracking algorithm used for the Turnpike problem. For symmetric errors, we use a polynomial characterization of the mass information and adapt polynomial evaluation code constructions for this setting. In the process, we develop new efficient decoding algorithms for a constant number of composition errors and show that the redundancy of the scheme scales quadratically with the number of errors and logarithmically with the codelength.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源