Paper title
Algorithm for the replica redistribution in the implementation of parallel annealing method on the hybrid supercomputer architecture
Authors
Abstract
The parallel annealing method is a promising approach for large-scale simulations because it is potentially scalable on any parallel architecture. We present an implementation of the algorithm on a hybrid program architecture combining CUDA and MPI. The challenge is to redistribute replicas efficiently while keeping all general-purpose graphics processing unit (GPGPU) devices as busy as possible. We provide details of tests on Intel Skylake/Nvidia V100 based hardware running more than two million replicas of the Ising model in parallel. The results are encouraging: the speedup approaches the ideal scaling line as the complexity of the simulated system grows.
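
The load-balancing problem named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes that after a resampling step each device holds a different number of replicas, and that the goal is to even out those counts with as few transfers as a simple greedy pairing gives. The function names `plan_transfers` and `apply_moves` are hypothetical, and plain Python lists stand in for the per-device replica counts that a CUDA/MPI code would exchange between ranks.

```python
def plan_transfers(counts):
    """Plan (src, dst, n) moves that even out replica counts across devices.

    Assumption: `counts[i]` is the number of replicas on device i after
    resampling. Devices end up within one replica of each other.
    """
    total = sum(counts)
    n_dev = len(counts)
    base, extra = divmod(total, n_dev)
    # Target load: the first `extra` devices hold one replica more than the rest.
    targets = [base + (1 if i < extra else 0) for i in range(n_dev)]
    # Devices above target give replicas away; devices below target receive them.
    surplus = [[i, c - t] for i, (c, t) in enumerate(zip(counts, targets)) if c > t]
    deficit = [[i, t - c] for i, (c, t) in enumerate(zip(counts, targets)) if c < t]
    moves = []
    si = di = 0
    while si < len(surplus) and di < len(deficit):
        # Move as many replicas as the current donor has or the receiver needs.
        n = min(surplus[si][1], deficit[di][1])
        moves.append((surplus[si][0], deficit[di][0], n))
        surplus[si][1] -= n
        deficit[di][1] -= n
        if surplus[si][1] == 0:
            si += 1
        if deficit[di][1] == 0:
            di += 1
    return moves


def apply_moves(counts, moves):
    """Return the per-device counts after carrying out the planned moves."""
    balanced = list(counts)
    for src, dst, n in moves:
        balanced[src] -= n
        balanced[dst] += n
    return balanced


if __name__ == "__main__":
    counts = [10, 2, 7, 1]  # illustrative numbers, not data from the paper
    moves = plan_transfers(counts)
    print(moves)
    print(apply_moves(counts, moves))
```

In a hybrid setting each planned move would correspond to a device-to-device or rank-to-rank copy of replica states. The sketch only computes the plan and does not model transfer cost or topology.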