Paper Title
Massively Parallel Graph Drawing and Representation Learning
Paper Authors
Paper Abstract
To fully exploit the performance potential of modern multi-core processors, machine learning and data mining algorithms for big data must be parallelized in multiple ways. Today's CPUs consist of multiple cores, each following an independent thread of control, and each equipped with multiple arithmetic units which can perform the same operation on a vector of multiple data objects. Graph embedding, i.e., converting the vertices of a graph into numerical vectors, is a data mining task of high importance and is useful for graph drawing (low-dimensional vectors) and graph representation learning (high-dimensional vectors). In this paper, we propose MulticoreGEMPE (Graph Embedding by Minimizing the Predictive Entropy), an information-theoretic method which can generate low- and high-dimensional vectors. MulticoreGEMPE applies MIMD (Multiple Instruction, Multiple Data, using OpenMP) and SIMD (Single Instruction, Multiple Data, using AVX-512) parallelism. We propose general techniques, such as \emph{vectorized hashing} and \emph{vectorized reduction}, that are also applicable to other graph-based algorithms. Our experimental evaluation demonstrates the superiority of our approach.
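To make the two levels of parallelism mentioned in the abstract concrete, the sketch below shows a generic combination of MIMD (OpenMP threads) and SIMD (an AVX-512 vectorized reduction) for summing a float array. This is an illustrative example only, not the paper's MulticoreGEMPE implementation; the function name `parallel_sum` and the compile flags are assumptions, and it requires an AVX-512-capable CPU (e.g. compile with `gcc -O3 -fopenmp -mavx512f`).

```c
// Minimal sketch (not the paper's code): combining MIMD (OpenMP threads)
// with SIMD (AVX-512) in a vectorized reduction over a float array.
#include <immintrin.h>
#include <omp.h>
#include <stddef.h>

float parallel_sum(const float *x, size_t n) {
    float total = 0.0f;
    // MIMD level: each OpenMP thread reduces its own share of 16-float blocks.
    #pragma omp parallel reduction(+:total)
    {
        __m512 acc = _mm512_setzero_ps();          // per-thread SIMD accumulator
        #pragma omp for nowait
        for (size_t blk = 0; blk < n / 16; ++blk) {
            // SIMD level: 16 floats added per instruction.
            acc = _mm512_add_ps(acc, _mm512_loadu_ps(x + blk * 16));
        }
        total += _mm512_reduce_add_ps(acc);        // horizontal sum of the 16 lanes
    }
    // Scalar tail for the remaining (n mod 16) elements.
    for (size_t i = n - (n % 16); i < n; ++i)
        total += x[i];
    return total;
}
```

The per-thread SIMD accumulator keeps the inner loop free of cross-thread communication; the OpenMP `reduction` clause then combines the per-thread partial sums once at the end of the parallel region.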