Channel-wise Hessian Aware trace-Weighted Quantization of Neural Networks

Paper Title

Channel-wise Hessian Aware trace-Weighted Quantization of Neural Networks

Authors

Xu Qian, Victor Li, Darren Crews

Abstract

Second-order information has proven to be very effective in determining the redundancy of neural network weights and activations. A recent paper proposes using the Hessian traces of weights and activations for mixed-precision quantization and achieves state-of-the-art results. However, prior work focuses only on selecting a bit-width for each layer, even though the redundancy of different channels within a layer also varies considerably. This is mainly because determining a bit-width for each channel is too complex for the original methods. Here, we introduce Channel-wise Hessian Aware trace-Weighted Quantization (CW-HAWQ). CW-HAWQ uses the Hessian trace to determine the relative sensitivity order of the different channels of activations and weights. In addition, CW-HAWQ uses a deep reinforcement learning (DRL) agent based on Deep Deterministic Policy Gradient (DDPG) to find the optimal ratios of the different quantization bit-widths, and assigns bits to channels according to the Hessian trace order. The number of states in CW-HAWQ is much smaller than in traditional AutoML-based mixed-precision methods, since only the ratios of the quantization bit-widths need to be searched. A comparison of CW-HAWQ with state-of-the-art methods shows that it achieves better results on multiple networks.
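
The assignment step described in the abstract (rank channels by Hessian trace, then hand out bit-widths according to a set of ratios) can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `assign_bits_by_trace`, the example trace values, the hand-picked ratios, and the rounding scheme are all assumptions made for illustration. In CW-HAWQ the ratios would come from the DDPG agent, and the traces would be estimated from the network rather than written by hand.

```python
from typing import Dict, List, Sequence


def assign_bits_by_trace(
    traces: Sequence[float],
    bit_ratios: Dict[int, float],
) -> List[int]:
    """Assign a bit-width to each channel from its Hessian-trace rank.

    traces: one (estimated) Hessian trace per channel; larger = more sensitive.
    bit_ratios: maps bit-width -> fraction of channels; fractions must sum to 1.
    Hypothetical helper: the rounding rule below is an assumption.
    """
    if abs(sum(bit_ratios.values()) - 1.0) > 1e-6:
        raise ValueError("bit ratios must sum to 1")

    n = len(traces)
    # Channel indices sorted from most to least sensitive.
    order = sorted(range(n), key=lambda c: traces[c], reverse=True)

    bits = [0] * n
    start = 0
    cumulative = 0.0
    # Visit the highest bit-width first so the most sensitive channels get it.
    for width in sorted(bit_ratios, reverse=True):
        cumulative += bit_ratios[width]
        end = min(n, round(cumulative * n))
        for c in order[start:end]:
            bits[c] = width
        start = end

    # If rounding left any channels unassigned, give them the lowest width.
    lowest = min(bit_ratios)
    for c in order[start:]:
        bits[c] = lowest
    return bits


# Made-up traces for an 8-channel layer and made-up ratios (not from the paper).
traces = [0.9, 0.1, 0.5, 0.05, 0.7, 0.3, 0.2, 0.6]
ratios = {8: 0.25, 4: 0.5, 2: 0.25}
channel_bits = assign_bits_by_trace(traces, ratios)
print(channel_bits)
```

Because the ordering is fixed by the traces, the only free parameters are the ratios themselves (three numbers in this example), which is consistent with the abstract's remark that the search space is much smaller than when a bit-width is searched for each channel directly.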
