Quantized Neural Networks: Characterization and Holistic Optimization

Paper Title

Quantized Neural Networks: Characterization and Holistic Optimization

Authors

Yoonho Boo, Sungho Shin, Wonyong Sung

Abstract

Quantized deep neural networks (QDNNs) are necessary for low-power, high-throughput, and embedded applications. Previous studies mostly focused on developing optimization methods for the quantization of given models. However, quantization sensitivity depends on the model architecture. Therefore, model selection needs to be part of the QDNN design process. Also, the characteristics of weight and activation quantization are quite different. This study proposes a holistic approach for the optimization of QDNNs, which contains QDNN training methods as well as quantization-friendly architecture design. Synthesized data is used to visualize the effects of weight and activation quantization. The results indicate that deeper models are more sensitive to activation quantization, while wider models improve the resiliency to both weight and activation quantization. This study can provide insight into better optimization of QDNNs.
