Paper Title

MariusGNN: Resource-Efficient Out-of-Core Training of Graph Neural Networks

Authors

Roger Waleffe, Jason Mohoney, Theodoros Rekatsinas, Shivaram Venkataraman

Abstract

We study training of Graph Neural Networks (GNNs) for large-scale graphs. We revisit the premise of using distributed training for billion-scale graphs and show that for graphs that fit in main memory or the SSD of a single machine, out-of-core pipelined training with a single GPU can outperform state-of-the-art (SoTA) multi-GPU solutions. We introduce MariusGNN, the first system that utilizes the entire storage hierarchy -- including disk -- for GNN training. MariusGNN introduces a series of data organization and algorithmic contributions that 1) minimize the end-to-end time required for training and 2) ensure that models learned with disk-based training exhibit accuracy similar to that of models fully trained in memory. We evaluate MariusGNN against SoTA systems for learning GNN models and find that single-GPU training in MariusGNN achieves the same level of accuracy up to 8x faster than multi-GPU training in these systems, introducing an order-of-magnitude reduction in monetary cost. MariusGNN is open-sourced at www.marius-project.org.
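
The core mechanism the abstract alludes to -- pipelined out-of-core training that overlaps disk I/O with GPU computation -- can be illustrated with a minimal sketch. The code below is a hypothetical PyTorch-style illustration, not MariusGNN's actual implementation: the on-disk partition format, the load_partition helper, and the model/loss interfaces are all assumptions introduced for the example.

```python
# Minimal sketch of out-of-core pipelined GNN training: while the GPU
# trains on the current graph partition, a background thread prefetches
# the next partition from disk. Hypothetical illustration only; the
# on-disk format, load_partition(), and the model interface are
# assumptions, not MariusGNN's actual API.
from concurrent.futures import ThreadPoolExecutor

import torch


def load_partition(path):
    """Load one graph partition from disk (assumed: a dict with node
    features "x", an "edge_index" tensor, and labels "y")."""
    return torch.load(path)


def train_epoch(model, optimizer, loss_fn, partition_paths, device="cuda"):
    with ThreadPoolExecutor(max_workers=1) as io:
        # Keep exactly one disk load in flight at all times.
        future = io.submit(load_partition, partition_paths[0])
        for i in range(len(partition_paths)):
            part = future.result()  # blocks only if disk is the bottleneck
            if i + 1 < len(partition_paths):
                # Start loading the next partition while the GPU works.
                future = io.submit(load_partition, partition_paths[i + 1])
            x = part["x"].to(device)
            edge_index = part["edge_index"].to(device)
            y = part["y"].to(device)
            optimizer.zero_grad()
            loss = loss_fn(model(x, edge_index), y)
            loss.backward()
            optimizer.step()
```

The same structure generalizes to finer-grained pipelines (for example, prefetching at mini-batch rather than partition granularity), which is closer in spirit to the pipelined design the paper describes.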
