论文标题

平行结构化的分裂和框架算法,用于对称三角形特征值问题

A parallel structured divide-and-conquer algorithm for symmetric tridiagonal eigenvalue problems

论文作者

Liao, Xia, Li, Shengguo, Lu, Yutong, Roman, Jose E.

论文摘要

在本文中,提出了基于Scalapack的对称三轴对称矩阵和平行的结构矩阵乘法算法(称为psmma)的对称的三角形矩阵。通过矩阵 - 矩阵乘法计算特征向量是分裂和争议算法中最昂贵的部分,而与此类乘法相关的矩阵之一是等级结构的cauchy类矩阵。通过利用这一特定属性,PSMMA通过使用cauchy样矩阵的发电机而无需任何通信来构建局部矩阵,并通过使用结构化的低级别近似算法进一步降低了计算成本。因此,通信和计算成本都降低了。实验结果表明,PSMMA和PSDC都高度可扩展,至少扩展到4096个过程。 PSDC的可伸缩性比[J.计算。应用。数学。 344(2018)512--520],仅缩放到相同矩阵的300个过程。与Scalapack中的\ texttt {pdstedc}进行比较,PSDC总是更快,并实现$ 1.4 $ x- $ x- $ 1.6 $ x的速度,对于某些矩阵很少。 PSDC也与ELPA相当,当使用许多过程时,PSDC比ELPA快于ELPA。

In this paper, a parallel structured divide-and-conquer (PSDC) eigensolver is proposed for symmetric tridiagonal matrices based on ScaLAPACK and a parallel structured matrix multiplication algorithm, called PSMMA. Computing the eigenvectors via matrix-matrix multiplications is the most computationally expensive part of the divide-and-conquer algorithm, and one of the matrices involved in such multiplications is a rank-structured Cauchy-like matrix. By exploiting this particular property, PSMMA constructs the local matrices by using generators of Cauchy-like matrices without any communication, and further reduces the computation costs by using a structured low-rank approximation algorithm. Thus, both the communication and computation costs are reduced. Experimental results show that both PSMMA and PSDC are highly scalable and scale to 4096 processes at least. PSDC has better scalability than PHDC that was proposed in [J. Comput. Appl. Math. 344 (2018) 512--520] and only scaled to 300 processes for the same matrices. Comparing with \texttt{PDSTEDC} in ScaLAPACK, PSDC is always faster and achieves $1.4$x--$1.6$x speedup for some matrices with few deflations. PSDC is also comparable with ELPA, with PSDC being faster than ELPA when using few processes and a little slower when using many processes.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源