Paper Title

Sparse Matrix-Based HPC Tomography

Authors

Stefano Marchesini, Anuradha Trivedi, Pablo Enfedaque, Talita Perciano, Dilworth Parkinson

Abstract

Tomographic imaging has benefited from advances in X-ray sources, detectors and optics to enable novel observations in science, engineering and medicine. These advances have come with a dramatic increase in input data in the form of faster frame rates, larger fields of view or higher resolution, so high-performance solutions are now widely used for analysis. Tomographic instruments can vary significantly from one to another, including in the hardware employed for reconstruction: from single-CPU workstations to large-scale hybrid CPU/GPU supercomputers. Flexibility in the software interfaces and reconstruction engines is also highly valued, as it allows for easy development and prototyping. This paper presents a novel software framework for tomographic analysis that tackles all of the aforementioned requirements. The proposed solution capitalizes on the increased performance of sparse matrix-vector multiplication and exploits multi-CPU and GPU reconstruction over MPI. The solution is implemented in Python and relies on CuPy for fast GPU operators and CUDA kernel integration, and on SciPy for CPU sparse matrix computation. As opposed to previous tomography solutions that are tailor-made for specific use cases or hardware, the proposed software is designed to provide flexible, portable and high-performance operators that can be used for continuous integration in different production environments, but also for prototyping new experimental settings or for algorithmic development. The experimental results demonstrate that our implementation can even outperform state-of-the-art software packages used at advanced X-ray sources worldwide.
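
To make the central idea concrete, the following is a minimal sketch of tomographic projection expressed as a sparse matrix-vector product with SciPy. It is not the authors' implementation: the helper name radon_matrix, the nearest-neighbor detector binning, the parallel-beam geometry and the toy image size are all assumptions chosen for illustration.

```python
import numpy as np
from scipy.sparse import coo_matrix


def radon_matrix(n, angles):
    """Hypothetical helper: sparse parallel-beam projection matrix.

    Assumes a nearest-neighbor model: each pixel center is assigned to the
    closest detector bin, so every pixel adds one nonzero per angle.
    """
    # Pixel-center coordinates, centered on the image.
    c = np.arange(n) - (n - 1) / 2.0
    xx, yy = np.meshgrid(c, c, indexing="xy")
    pixel = np.arange(n * n)
    # Detector wide enough to cover the image diagonal.
    n_det = int(np.ceil(n * np.sqrt(2))) + 1
    rows, cols = [], []
    for k, theta in enumerate(angles):
        # Position of each pixel center along the detector axis.
        t = xx.ravel() * np.cos(theta) + yy.ravel() * np.sin(theta)
        det = np.rint(t + (n_det - 1) / 2.0).astype(int)
        rows.append(k * n_det + det)
        cols.append(pixel)
    rows = np.concatenate(rows)
    cols = np.concatenate(cols)
    vals = np.ones(rows.size)
    shape = (len(angles) * n_det, n * n)
    return coo_matrix((vals, (rows, cols)), shape=shape).tocsr()


n = 16
angles = np.linspace(0.0, np.pi, 24, endpoint=False)
A = radon_matrix(n, angles)

# Toy image: a small rectangle of ones.
image = np.zeros((n, n))
image[5:11, 6:10] = 1.0
x = image.ravel()

sinogram = A @ x            # forward projection: one sparse mat-vec
backproj = A.T @ sinogram   # backprojection: mat-vec with the transpose
```

Once the geometry is encoded in a matrix, iterative reconstruction methods reduce to repeated products with A and its transpose. On a GPU, the same CSR matrix could in principle be handed to cupyx.scipy.sparse.csr_matrix so that the products run through CuPy; whether that matches the framework's actual data path is not stated in the abstract.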
