Paper Title
PROFIT: A Novel Training Method for sub-4-bit MobileNet Models
Paper Authors
Paper Abstract
4-bit and lower precision mobile models are required due to the ever-increasing demand for better energy efficiency in mobile devices. In this work, we report that activation instability induced by weight quantization (AIWQ) is the key obstacle to sub-4-bit quantization of mobile networks. To alleviate the AIWQ problem, we propose a novel training method called PROgressive-Freezing Iterative Training (PROFIT), which attempts to freeze the layers whose weights are more strongly affected by the instability problem than those of other layers. We also propose a differentiable and unified quantization method (DuQ) and a negative-padding idea to support asymmetric activation functions such as h-swish. We evaluate the proposed methods by quantizing MobileNet-v1, v2, and v3 on ImageNet, and report that 4-bit quantization offers accuracy comparable (within 1.48% top-1 accuracy) to the full-precision baseline. In an ablation study on 3-bit quantization of MobileNet-v3, our proposed method outperforms the state-of-the-art method by a large margin of 12.86% top-1 accuracy.
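
The core of PROFIT, as the abstract describes it, is a training schedule rather than a new quantizer: layers are ranked by how unstable their output activations become under weight quantization, and the most unstable layers are frozen first while the rest continue training. The following is a minimal PyTorch-style sketch of that idea only; the instability proxy aiwq_score, the per-stage freezing fraction, and all function names are illustrative assumptions, not the paper's exact AIWQ metric or hyperparameters.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def aiwq_score(layer: nn.Conv2d, quantize_fn, x: torch.Tensor) -> float:
        """Illustrative proxy for AIWQ: how much a layer's output
        distribution shifts when only its weights are quantized.
        (Assumption: the paper defines its own metric.)"""
        with torch.no_grad():
            y_fp = F.conv2d(x, layer.weight, layer.bias, layer.stride,
                            layer.padding, layer.dilation, layer.groups)
            y_q = F.conv2d(x, quantize_fn(layer.weight), layer.bias,
                           layer.stride, layer.padding, layer.dilation,
                           layer.groups)
            # normalized mean-squared shift of the output activations
            denom = y_fp.pow(2).mean().clamp_min(1e-12)
            return ((y_fp - y_q).pow(2).mean() / denom).item()

    def profit_train(layers, train_one_epoch, quantize_fn, sample_inputs,
                     num_stages=3, epochs_per_stage=5):
        """Progressive freezing: after each stage of training, freeze the
        still-trainable layers that score highest on the instability proxy.
        `layers` maps names to modules; `sample_inputs` maps names to a
        representative input tensor for each layer (both hypothetical)."""
        frozen = set()
        for stage in range(num_stages):
            for _ in range(epochs_per_stage):
                train_one_epoch()
            scores = {name: aiwq_score(layer, quantize_fn, sample_inputs[name])
                      for name, layer in layers.items() if name not in frozen}
            # freeze a fraction of the remaining layers each stage
            k = max(1, len(scores) // (num_stages - stage))
            for name, _ in sorted(scores.items(), key=lambda kv: -kv[1])[:k]:
                for p in layers[name].parameters():
                    p.requires_grad_(False)
                frozen.add(name)

By the final stage every layer is frozen, so training ends with all weights fixed at their quantization-friendly values; the actual staging and metric should be taken from the paper.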
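The abstract also names DuQ, a differentiable and unified quantization method. Its exact formulation is not given here; the sketch below only shows the generic pattern such a quantizer follows: a uniform quantizer with a learnable scale, made trainable via a straight-through estimator. STERound, uniform_quantize, and the signed/unsigned handling are assumptions for illustration, not the paper's definition.

    import torch

    class STERound(torch.autograd.Function):
        """Rounding with a straight-through gradient: the standard trick
        that keeps a quantizer trainable end to end."""
        @staticmethod
        def forward(ctx, x):
            return x.round()

        @staticmethod
        def backward(ctx, grad_out):
            return grad_out  # pass the gradient through the rounding step

    def uniform_quantize(x, scale, n_bits=4, signed=True):
        """Uniform quantizer with a learnable scale (hypothetical sketch,
        not the paper's DuQ)."""
        qmax = 2 ** (n_bits - 1) - 1 if signed else 2 ** n_bits - 1
        qmin = -(qmax + 1) if signed else 0
        q = STERound.apply((x / scale).clamp(qmin, qmax))
        return q * scale

With scale registered as an nn.Parameter, gradients flow to it through the straight-through estimator, which is what makes such a quantizer differentiable. Note that a signed uniform scheme like this one cannot by itself represent the asymmetric range of h-swish, which is the gap the abstract's negative-padding idea addresses; its details are in the paper.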