Paper Title

DisTRaC: Accelerating High Performance Compute Processing for Temporary Data Storage

Authors

Gabryel Mason-Williams, Dave Bond, Mark Basham

Abstract

High Performance Compute (HPC) clusters often produce intermediate files as part of code execution, and message passing is not always a viable way to supply data to these cluster jobs. In these cases, I/O goes back to central distributed storage to allow cross-node data sharing. These systems are often high performance and are characterised by their high cost per TB and their sensitivity to workload type, such as being tuned to small or large file I/O. However, compute nodes often have large amounts of RAM, so when dealing with intermediate files where longevity or reliability of the system is not as important, local RAM disks can be used to obtain performance benefits. In this paper we show how this problem was tackled by creating a RAM block that could interact with the object storage system Ceph, as well as creating a deployment tool to deploy Ceph on HPC infrastructure effectively. This work resulted in a system that was more performant than the central high performance distributed storage system used at Diamond, reducing I/O overhead and processing time for Savu, a tomography data processing application, by 81.04% and 8.32% respectively.
