Paper title
"Smarter" NICs for faster molecular dynamics: a case study
Paper authors
Paper abstract
This work evaluates the benefits of using a "smart" network interface card (SmartNIC) as a compute accelerator, taking the MiniMD molecular dynamics proxy application as an example. The accelerator is NVIDIA's BlueField-2 card, which includes an 8-core Arm processor along with a small amount of DRAM and storage. Using microbenchmarks and MiniMD, we test the networking and data movement performance of these cards against a standard Intel server host. In MiniMD, we identify two distinct classes of computation, namely core computation and maintenance computation, which are executed in sequence. We restructure the algorithm and code to weaken this dependence and increase task parallelism, making it possible to use the BlueField-2 concurrently with the host and thereby raise its utilization. We evaluate our implementation on a cluster of 16 dual-socket Intel Broadwell host nodes with one BlueField-2 per host node. Our results show that while the overall compute performance of the BlueField-2 is limited, using these cards with a modified MiniMD algorithm yields up to a 20% speedup over the host CPU baseline with no loss in simulation accuracy.