Paper title
Achievable Rates of Concatenated Codes in DNA Storage under Substitution Errors
Paper authors
Paper abstract
In this paper, we study the achievable rates of concatenated coding schemes over a deoxyribonucleic acid (DNA) storage channel. Our channel model incorporates the main features of DNA-based data storage. First, information is stored on many short DNA strands. Second, the strands are stored in an unordered fashion inside the storage medium, and each strand is replicated many times. Third, the data is accessed in an uncontrollable manner, i.e., random strands are drawn from the medium and received, possibly with errors. As one of our results, we show that there is a significant gap between the channel capacity and the achievable rate of a standard concatenated code in which one strand corresponds to one inner block. This is surprising because, for other channels such as $q$-ary symmetric channels, concatenated codes are known to achieve capacity. We further propose a modified concatenated coding scheme that combines several strands into one inner block, which narrows the gap and achieves rates close to capacity.
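The three channel features described in the abstract can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the model as formalized in the paper: the function names `substitute` and `dna_storage_channel`, the uniform draw with replacement, and the symmetric substitution model (each nucleotide is independently replaced by one of the three other nucleotides with probability `sub_prob`) are all illustrative choices.

```python
import random

ALPHABET = "ACGT"  # the four nucleotides


def substitute(strand, sub_prob, rng):
    """Apply independent substitution errors to one strand.

    Assumption: each symbol is replaced, with probability sub_prob,
    by a different nucleotide chosen uniformly at random.
    """
    out = []
    for base in strand:
        if rng.random() < sub_prob:
            out.append(rng.choice([b for b in ALPHABET if b != base]))
        else:
            out.append(base)
    return "".join(out)


def dna_storage_channel(strands, num_reads, sub_prob, rng=None):
    """Return num_reads noisy reads drawn from the stored strands.

    Strands are drawn uniformly at random with replacement, so the
    reads carry no ordering information and some stored strands may
    never be observed. This sampling rule is an assumption made here.
    """
    rng = rng or random.Random()
    reads = []
    for _ in range(num_reads):
        drawn = rng.choice(strands)  # uncontrolled, unordered access
        reads.append(substitute(drawn, sub_prob, rng))
    return reads


if __name__ == "__main__":
    stored = ["ACGTAC", "TTGACA", "GGCATC"]  # hypothetical short strands
    for read in dna_storage_channel(stored, 5, 0.05, random.Random(0)):
        print(read)
```

Because the receiver sees only an unordered multiset of noisy reads, a decoder must first decide which reads belong to the same stored strand before it can correct the substitution errors.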