Paper Title


Direct Quantization for Training Highly Accurate Low Bit-width Deep Neural Networks

Authors

Tuan Hoang, Thanh-Toan Do, Tam V. Nguyen, Ngai-Man Cheung

Abstract


This paper proposes two novel techniques for training deep convolutional neural networks with low bit-width weights and activations. First, to obtain low bit-width weights, most existing methods derive the quantized weights by quantizing the full-precision network weights. However, this approach results in a mismatch: gradient descent updates the full-precision weights, but not the quantized weights. To address this issue, we propose a novel method that enables direct updating of the quantized weights, with learnable quantization levels, to minimize the cost function using gradient descent. Second, to obtain low bit-width activations, existing works treat all channels equally. However, the activation quantizers could be biased toward a few channels with high variance. To address this issue, we propose a method that takes into account the quantization errors of individual channels. With this approach, we can learn activation quantizers that minimize the quantization errors in the majority of channels. Experimental results demonstrate that our proposed method achieves state-of-the-art performance on the image classification task, using AlexNet, ResNet, and MobileNetV2 architectures on the CIFAR-100 and ImageNet datasets.
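To make the two ideas in the abstract concrete, below is a minimal PyTorch-style sketch. It is not the authors' implementation: the straight-through estimator for passing gradients through the nearest-level assignment, and the median over per-channel errors as a way to fit "the majority of channels", are both assumptions, and the names `LearnableLevelQuantizer` and `per_channel_quant_error` are hypothetical.

```python
# Illustrative sketch only (not the paper's released code).
import torch
import torch.nn as nn


class LearnableLevelQuantizer(nn.Module):
    """Map each weight to its nearest quantization level; the levels are trainable."""

    def __init__(self, num_levels: int = 4, init_range: float = 1.0):
        super().__init__()
        # Evenly spaced initial levels, refined by gradient descent during training.
        self.levels = nn.Parameter(torch.linspace(-init_range, init_range, num_levels))

    def forward(self, w: torch.Tensor) -> torch.Tensor:
        # Distance of every weight to every level; pick the nearest level.
        dist = (w.unsqueeze(-1) - self.levels).abs()   # shape (..., num_levels)
        idx = dist.argmin(dim=-1)                      # index of nearest level
        w_q = self.levels[idx]                         # quantized weights
        # Straight-through estimator (an assumption, not necessarily the paper's rule):
        # the forward pass returns w_q; the backward pass sends gradients to the
        # levels through w_q and to the full-precision w as if quantization were identity.
        return w_q + (w - w.detach())


def per_channel_quant_error(x: torch.Tensor, x_q: torch.Tensor) -> torch.Tensor:
    """Median of per-channel MSE for activations of shape (N, C, H, W).

    Using the median instead of a single global mean is one way to keep a few
    high-variance channels from dominating the quantizer fit.
    """
    err = ((x - x_q) ** 2).mean(dim=(0, 2, 3))  # one error value per channel
    return err.median()


# Example usage on a random weight tensor:
w = torch.randn(64, 32, 3, 3, requires_grad=True)
quantizer = LearnableLevelQuantizer(num_levels=4)
w_q = quantizer(w)            # use w_q in the layer's forward pass
loss = w_q.pow(2).mean()      # stand-in for the task loss
loss.backward()               # gradients reach both w and quantizer.levels
```

The sketch shows why the direct update matters: because the loss is computed on `w_q` while gradients still flow to both the full-precision weights and the quantization levels, the levels themselves adapt to the task rather than being fixed after-the-fact rounding points.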
