Paper Title

SGQuant: Squeezing the Last Bit on Graph Neural Networks with Specialized Quantization

Paper Authors

Boyuan Feng, Yuke Wang, Xu Li, Shu Yang, Xueqiao Peng, Yufei Ding

Paper Abstract

With the increasing popularity of graph-based learning, Graph Neural Networks (GNNs) have attracted significant attention from both research and industry thanks to their high accuracy. However, existing GNNs suffer from a high memory footprint (e.g., node embedding features), which hinders their potential deployment on memory-constrained devices, such as widely deployed IoT devices. To this end, we propose a specialized GNN quantization scheme, SGQuant, to systematically reduce GNN memory consumption. Specifically, we first propose a GNN-tailored quantization algorithm design and a GNN quantization fine-tuning scheme to reduce memory consumption while maintaining accuracy. Then, we investigate a multi-granularity quantization strategy that operates at different levels (components, graph topology, and layers) of GNN computation. Moreover, we offer an automatic bit-selecting (ABS) strategy to pinpoint the most appropriate quantization bits for the above multi-granularity quantizations. Intensive experiments show that SGQuant can effectively reduce the memory footprint by 4.25x to 31.9x compared with the original full-precision GNNs, while limiting the accuracy drop to 0.4% on average.
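
To make the quantization idea concrete, the sketch below shows generic uniform (affine) quantization and dequantization of a node-feature tensor in PyTorch. This is an illustrative assumption rather than SGQuant's exact algorithm: the function names quantize_features/dequantize_features are hypothetical, and the paper additionally varies the bitwidth per component, per graph-topology group, and per layer via its ABS search.

```python
import torch

def quantize_features(x: torch.Tensor, num_bits: int = 8):
    """Uniform affine quantization of a node-feature tensor.

    Returns integer codes plus the (scale, zero_point) needed to map them
    back to floats. Illustrative sketch only; assumes num_bits <= 8 so the
    codes fit in uint8.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = x.min(), x.max()
    scale = (x_max - x_min).clamp(min=1e-8) / (qmax - qmin)
    zero_point = torch.round(qmin - x_min / scale)
    codes = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax)
    return codes.to(torch.uint8), scale, zero_point

def dequantize_features(codes, scale, zero_point):
    """Approximate reconstruction of the full-precision features."""
    return (codes.float() - zero_point) * scale

# Example: an 8-bit version of a Cora-sized embedding matrix (2708 nodes,
# 1433 features) stores uint8 codes instead of float32 values, i.e. roughly
# 4x less memory for this tensor.
x = torch.randn(2708, 1433)
codes, scale, zero_point = quantize_features(x, num_bits=8)
x_hat = dequantize_features(codes, scale, zero_point)
print((x - x_hat).abs().max())  # quantization error is bounded by ~scale/2
```

Storing uint8 codes in place of 32-bit floats accounts for roughly the 4x end of the reported savings; lower bitwidths (with sub-byte packing) and the multi-granularity bit assignment push the reduction toward the 31.9x end reported in the paper.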
