Paper Title

Efficient Integer-Arithmetic-Only Convolutional Neural Networks

Paper Authors

Hengrui Zhao, Dong Liu, Houqiang Li

Paper Abstract

Integer-arithmetic-only networks have been demonstrated to be effective in reducing computational cost and ensuring cross-platform consistency. However, previous works usually report a decline in inference accuracy when converting well-trained floating-point-number (FPN) networks into integer networks. We analyze this phenomenon and find that the decline is due to activation quantization. Specifically, when we replace the conventional ReLU with a Bounded ReLU, how to set the bound for each neuron is a key problem. Considering the tradeoff between activation quantization error and network learning ability, we propose an empirical rule to tune the bound of each Bounded ReLU. We also design a mechanism to handle the cases of feature map addition and feature map concatenation. Based on the proposed method, our trained 8-bit integer ResNet outperforms the 8-bit networks of Google's TensorFlow and NVIDIA's TensorRT for image recognition. We also experiment on VDSR for image super-resolution and on VRCNN for compression artifact reduction, both of which are regression tasks that inherently require high inference accuracy. Our integer networks achieve performance equivalent to that of the corresponding FPN networks, but require only 1/4 of the memory and run 2x faster on modern GPUs. Our code and models can be found at github.com/HengRuiZ/brelu.
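
To make the role of the Bounded ReLU and activation quantization concrete, below is a minimal PyTorch sketch of a bounded, 8-bit-quantized activation. It is an illustration under assumptions, not the authors' released implementation: the class name BoundedReLU, the default bound of 6.0, and the uniform quantization scheme are assumptions; the paper's empirical rule for tuning the bound of each Bounded ReLU is not reproduced here.

```python
import torch
import torch.nn as nn


class BoundedReLU(nn.Module):
    """Bounded ReLU with simulated 8-bit activation quantization (illustrative sketch)."""

    def __init__(self, bound: float = 6.0, num_bits: int = 8):
        super().__init__()
        self.bound = bound               # upper clipping bound (assumed fixed here)
        self.levels = 2 ** num_bits - 1  # 255 integer levels for 8 bits

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Bounded ReLU: clamp activations into [0, bound].
        y = torch.clamp(x, 0.0, self.bound)
        # Uniform quantization: snap [0, bound] onto the integer grid and
        # rescale, so the forward pass sees the quantization error.
        scale = self.bound / self.levels
        return torch.round(y / scale) * scale
```

In a real training pipeline, torch.round blocks gradients, so a straight-through estimator (or similar) would typically be used; at inference, the scale can be folded into the surrounding layers' integer arithmetic.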
