论文标题

图形数据的新观点和大型图数据遍历的通用和高效方法

A New Perspective of Graph Data and A Generic and Efficient Method for Large Scale Graph Data Traversal

论文作者

Zhang, Chenglong

论文摘要

BFS算法是一种基本的图形数据处理算法,许多其他图形数据处理算法具有与BFS算法相似的架构特征,并且可以基于BFS算法模型来构建。我们详细分析了图算法和传统高性能算法之间的差异,提出了一种将算法分类为数据独立算法和数据相关算法的新方法,并基于其与数据的运行时间相关性,并使用此新分类来解释本文提出的方法的有效性。通过对图形数据的更深入的分析,我们提出了一个新的基本观点,以了解图形数据,在两个基本数据结构,图形和树之间建立链接,并将图形数据视为由较小的子图和边缘树组成。小度的顶点被认为是随机内存访问的重要原因之一。基于此,我们提出了一种一般,易于实现和高效的图形数据处理方法,其基本思想是分别处理低度顶点和核心子图,从而显着降低了随机内存访问的大小并提高内存访问的效率。最后,我们评估了该方法在三个主要数据中心计算平台(Intel,AMD和ARM)上的性能,并且实验表明,它分别带来了19.7%,31.8%和17.9%的性能提高,性能功率比为282.70 mteps/s的手臂平台,在绿色图中,在11月11日在2019年11月的Green Grapt500中排名2019年11月的Green Grapt500。

The BFS algorithm is a basic graph data processing algorithm and many other graph data processing algorithms have similar architectural features with BFS algorithm and can be built on the basis of BFS algorithm model. We analyze the differences between graph algorithms and traditional high-performance algorithms in detail, propose a new way of classifying algorithms into data independent algorithm and data correlation algorithm based on their run-time correlation with data, and use this new classification to explain the validity of the methods proposed in this paper. Through a deeper analysis of graph data, we propose a new fundamental perspective on understanding graph data, establishing a link between two basic data structures, graph and tree, and viewing graph data as consisting of smaller subgraphs and edge trees. Small degree vertices are found to be one of important cause of random memory access. Based on this, we propose a general, easy to implement, and efficient method for graph data processing, with the basic idea of treating low-degree vertices and core subgraphs separately, thus significantly reducing the size of random memory access and improving the efficiency of memory access. Finally, we evaluated the performance of the method on three major data center computing platforms (Intel, AMD, and ARM), and the experiments showed that it brought 19.7%, 31.8% and 17.9% performance improvement, respectively, with a performance-power ratio of 282.70 MTEPS/s on the ARM platform, ranking it among the Green graph500 in November 2019. World No. 1 on the big dataset list.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源