Paper Title

Efficient Matrix Factorization on Heterogeneous CPU-GPU Systems

Paper Authors

Yuanhang Yu, Dong Wen, Ying Zhang, Xiaoyang Wang, Wenjie Zhang, Xuemin Lin

Paper Abstract

Matrix Factorization (MF) has been widely applied in machine learning and data mining. A large number of algorithms have been studied to factorize matrices. Among them, stochastic gradient descent (SGD) is a commonly used method. Heterogeneous systems with multi-core CPUs and GPUs have recently become increasingly promising due to the prevalence of GPUs in general-purpose data-parallel applications. Because MF is computationally expensive, we aim to improve the efficiency of SGD-based MF by exploiting the massive parallel processing power of heterogeneous multiprocessors. The main challenges in parallelizing SGD on heterogeneous CPU-GPU systems lie in the granularity of the matrix division and the strategy for assigning tasks. We design a novel strategy that divides the matrix into a set of blocks by considering two aspects. First, we observe that the matrix should be divided nonuniformly, and relatively large blocks should be assigned to GPUs to saturate their computing power. Second, in addition to exploiting the characteristics of the hardware, the workloads assigned to the two types of hardware should be balanced. To derive the final division strategy, we design a cost model tailored to our problem that accurately estimates the performance of each hardware type on different data sizes. A dynamic scheduling policy is also used to further balance workloads in practice. Extensive experiments show that our proposed algorithm achieves high efficiency with high training quality.
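For readers unfamiliar with SGD-based MF, the sketch below shows the per-entry update that parallel schemes like the one described above partition into blocks. It is a minimal, single-threaded illustration only, not the paper's heterogeneous CPU-GPU implementation; the function name and the hyperparameters (lr, reg, rank k) are assumptions made for the example.

```python
import numpy as np

def sgd_mf_epoch(ratings, P, Q, lr=0.005, reg=0.02):
    """Run one SGD pass over the observed entries of a rating matrix.

    ratings -- iterable of (u, i, r) triples (observed matrix entries)
    P       -- user factor matrix of shape (num_users, k)
    Q       -- item factor matrix of shape (num_items, k)
    Each rating r is approximated by the inner product P[u] . Q[i].
    """
    for u, i, r in ratings:
        pu = P[u].copy()                       # keep old user factors for the item update
        err = r - pu @ Q[i]                    # prediction error on this entry
        P[u] += lr * (err * Q[i] - reg * pu)   # gradient step on user factors
        Q[i] += lr * (err * pu - reg * Q[i])   # gradient step on item factors
    return P, Q

# Toy usage: factorize a sparse 3x3 matrix with rank-2 factors.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    P = 0.1 * rng.standard_normal((3, 2))
    Q = 0.1 * rng.standard_normal((3, 2))
    obs = [(0, 0, 5.0), (0, 2, 1.0), (1, 1, 3.0), (2, 0, 4.0), (2, 2, 2.0)]
    for _ in range(200):
        P, Q = sgd_mf_epoch(obs, P, Q)
```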
