Paper title
Virtual Disk Snapshot Management at Scale
Paper authors
Paper abstract
Unlike other resources such as CPU, memory, and network, for which virtualization is efficiently achieved through direct access, disk virtualization is peculiar. In this paper, we make four contributions. First, we characterize disk utilization in a large-scale public cloud infrastructure, which reveals the presence of long snapshot chains, sometimes composed of up to 1000 files. Second, we show through experimental measurements that long chains lead to scalability issues in both performance and memory footprint. Third, we extend the Qcow2 format and its driver in Qemu to address the identified scalability challenges. Fourth, we thoroughly evaluate our prototype, called sQemu, and demonstrate that it brings significant performance improvements and memory footprint reduction. For example, on a snapshot chain of length 500, it improves the throughput of RocksDB by about 48% compared to vanilla Qemu. The memory overhead on that chain is also reduced by 15x.