Paper Title

Ramanujan Bipartite Graph Products for Efficient Block Sparse Neural Networks

Paper Authors

Dharma Teja Vooturi, Girish Varma, Kishore Kothapalli

Paper Abstract

Sparse neural networks have been shown to give predictions as accurate as their denser versions while requiring far fewer arithmetic operations. However, current hardware such as GPUs can exploit only structured sparsity patterns for better efficiency, so the runtime of a sparse neural network may not correspond to the number of arithmetic operations it requires. In this work, we propose the RBGP (Ramanujan Bipartite Graph Product) framework for generating structured multi-level block-sparse neural networks using the theory of graph products. We further propose using products of Ramanujan graphs, which give the best connectivity for a given level of sparsity. This essentially ensures that (i) the network has a structured block sparsity for which runtime-efficient algorithms exist, (ii) the model gives high prediction accuracy, owing to the better expressive power derived from the connectivity of the graph, and (iii) the graph data structure has a succinct representation that can be stored efficiently in memory. We use our framework to design a specific connectivity pattern, RBGP4, which makes efficient use of the memory hierarchy available on the GPU. We benchmark our approach on the image classification task over the CIFAR dataset using VGG19 and WideResNet-40-4 networks, achieving 5-9x and 2-5x runtime gains over unstructured and block sparsity patterns respectively, while reaching the same level of accuracy.
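To make the core construction concrete, below is a minimal sketch (not the authors' implementation) of how a graph product can generate a multi-level block-sparse connectivity mask. It assumes the bipartite graph product corresponds to a Kronecker product of biadjacency matrices, and it uses random left-regular bipartite graphs as stand-ins for Ramanujan graphs; all layer sizes and degrees are made-up illustration values.

```python
# Sketch: a multi-level block-sparse mask as a Kronecker product of
# bipartite biadjacency matrices. Random left-regular bipartite graphs
# stand in for the Ramanujan graphs used in the paper.
import numpy as np

def random_regular_biadjacency(rows, cols, d, rng):
    """Biadjacency matrix of a bipartite graph in which every left
    vertex has degree d (right-vertex degrees then average d*rows/cols)."""
    mask = np.zeros((rows, cols), dtype=np.float32)
    for r in range(rows):
        nz = rng.choice(cols, size=d, replace=False)
        mask[r, nz] = 1.0
    return mask

rng = np.random.default_rng(0)

# Two small bipartite factor graphs; their Kronecker (tensor) product
# is the connectivity pattern of the sparse layer.
g1 = random_regular_biadjacency(8, 8, 2, rng)    # coarse block pattern
g2 = random_regular_biadjacency(16, 16, 4, rng)  # pattern inside each block

mask = np.kron(g1, g2)   # (128, 128) block-sparse 0/1 mask
density = mask.mean()    # (2/8) * (4/16) = 6.25% here

# Applying the mask to a dense weight matrix yields a block-sparse layer:
# nonzeros are grouped into blocks, which is what GPU kernels can exploit.
w = rng.standard_normal(mask.shape).astype(np.float32) * mask
print(mask.shape, f"density={density:.4f}")
```

The design point the abstract makes shows up directly here: the product mask is fully determined by the two small factor graphs, so it has a succinct representation, and its nonzeros fall into regular blocks that map well onto the GPU memory hierarchy.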
