Paper Title

Make RepVGG Greater Again: A Quantization-aware Approach

Authors

Xiangxiang Chu, Liang Li, Bo Zhang

Abstract

The tradeoff between performance and inference speed is critical for practical applications. Architecture reparameterization obtains better tradeoffs and it is becoming an increasingly popular ingredient in modern convolutional neural networks. Nonetheless, its quantization performance is usually too poor to deploy (more than 20% top-1 accuracy drop on ImageNet) when INT8 inference is desired. In this paper, we dive into the underlying mechanism of this failure, where the original design inevitably enlarges quantization error. We propose a simple, robust, and effective remedy to have a quantization-friendly structure that also enjoys reparameterization benefits. Our method greatly bridges the gap between INT8 and FP32 accuracy for RepVGG. Without bells and whistles, the top-1 accuracy drop on ImageNet is reduced within 2% by standard post-training quantization. Moreover, our method also achieves similar FP32 performance as RepVGG. Extensive experiments on detection and semantic segmentation tasks verify its generalization.
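
To make the "reparameterization" ingredient in the abstract concrete, below is a minimal sketch (not the authors' released code) of how a RepVGG-style block, trained with parallel 3x3, 1x1, and identity branches each followed by BatchNorm, can be merged into a single 3x3 convolution for inference. All layer names, shapes, and helper functions here are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d):
    """Fold a BatchNorm layer into the preceding convolution's weight and bias."""
    std = (bn.running_var + bn.eps).sqrt()
    scale = bn.weight / std  # per-output-channel scale
    fused_w = conv.weight * scale.reshape(-1, 1, 1, 1)
    conv_b = conv.bias if conv.bias is not None else torch.zeros_like(bn.running_mean)
    fused_b = (conv_b - bn.running_mean) * scale + bn.bias
    return fused_w, fused_b


def merge_repvgg_branches(conv3x3, bn3x3, conv1x1, bn1x1, bn_identity=None):
    """Sum the 3x3, 1x1 (padded to 3x3), and identity branches into one 3x3 kernel."""
    w3, b3 = fuse_conv_bn(conv3x3, bn3x3)
    w1, b1 = fuse_conv_bn(conv1x1, bn1x1)
    # Zero-pad the 1x1 kernel so its weight sits at the center of a 3x3 kernel.
    w = w3 + F.pad(w1, [1, 1, 1, 1])
    b = b3 + b1
    if bn_identity is not None:
        # The identity branch (valid only when in/out channels match and stride is 1)
        # is equivalent to a 3x3 kernel with a single 1 at the center for channel c.
        out_ch = w3.shape[0]
        id_w = torch.zeros_like(w3)
        for c in range(out_ch):
            id_w[c, c, 1, 1] = 1.0
        std = (bn_identity.running_var + bn_identity.eps).sqrt()
        scale = bn_identity.weight / std
        w = w + id_w * scale.reshape(-1, 1, 1, 1)
        b = b + bn_identity.bias - bn_identity.running_mean * scale
    return w, b


# Usage sketch: load the merged weights into a plain 3x3 conv for deployment.
# deploy_conv = nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=True)
# deploy_conv.weight.data, deploy_conv.bias.data = merge_repvgg_branches(...)
```

The paper's observation is that this merging, together with how the branches are trained, produces weight and activation distributions that quantize poorly to INT8; the proposed quantization-friendly redesign targets exactly that failure mode while keeping the merged single-branch inference structure.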
