论文标题
拆分制度中代码转换的带宽成本
Bandwidth Cost of Code Conversions in the Split Regime
论文作者
论文摘要
分布式存储系统必须在长时间内存储大量数据。为了避免由于设备故障而导致的数据丢失,使用$ [n,k] $擦除代码将$ k $数据符号编码为$ n $符号的代码字,这些符号存储在不同的设备上。但是,设备故障率在数据的整个寿命中变化,并且根据这些更改调整$ n $和$ k $,可节省大量的存储空间。代码转换是将初始$ [n^i,k^i] $代码的多个代码字转换为最终$ [n^f,k^f] $代码的代码字,该代码将其解码为相同的一组数据符号。在本文中,我们研究转换带宽,定义为转换过程中节点之间传输的数据总量。特别是,我们考虑了初始代码和最终代码为MD的情况,并且单个初始代码分为几个最终代码字($ k^i =λ^f k^f $ for Integer $λ^f \ geq 2 $),称为拆分制度。我们在拆分方案中的转换带宽上得出了下限,并提出了显着降低转换带宽的结构,并且对于某些参数是最佳的。
Distributed storage systems must store large amounts of data over long periods of time. To avoid data loss due to device failures, an $[n,k]$ erasure code is used to encode $k$ data symbols into a codeword of $n$ symbols that are stored across different devices. However, device failure rates change throughout the life of the data, and tuning $n$ and $k$ according to these changes has been shown to save significant storage space. Code conversion is the process of converting multiple codewords of an initial $[n^I,k^I]$ code into codewords of a final $[n^F,k^F]$ code that decode to the same set of data symbols. In this paper, we study conversion bandwidth, defined as the total amount of data transferred between nodes during conversion. In particular, we consider the case where the initial and final codes are MDS and a single initial codeword is split into several final codewords ($k^I=λ^F k^F$ for integer $λ^F \geq 2$), called the split regime. We derive lower bounds on the conversion bandwidth in the split regime and propose constructions that significantly reduce conversion bandwidth and are optimal for certain parameters.