Paper Title
FleXOR: Trainable Fractional Quantization
Paper Authors
Paper Abstract
Quantization based on binary codes is gaining attention because each quantized bit can be directly utilized for computations without dequantization using look-up tables. Previous attempts, however, only allow for integer numbers of quantization bits, which ends up restricting the search space for compression ratio and accuracy. In this paper, we propose an encryption algorithm/architecture to compress quantized weights so as to achieve fractional numbers of bits per weight. Decryption during inference is implemented by digital XOR-gate networks added into the neural network model, while XOR gates are described by utilizing $\tanh(x)$ for backward propagation to enable gradient calculations. We perform experiments using MNIST, CIFAR-10, and ImageNet to show that inserting XOR gates learns quantization/encryption bit decisions through training and obtains high accuracy even for fractional sub-1-bit weights. As a result, our proposed method yields smaller model size and higher accuracy compared to binary neural networks.
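The key mechanism in the abstract is making XOR gates differentiable. A minimal sketch of the idea (not the authors' implementation): encode bit 0 as $+1$ and bit 1 as $-1$, so that XOR of two bits is simply their product; replacing the hard sign with $\tanh$ then yields a smooth surrogate through which gradients can flow. The `scale` parameter and function names below are illustrative assumptions.

```python
import numpy as np

def xor_hard(a, b):
    """Exact XOR on {-1, +1}-encoded bits (0 -> +1, 1 -> -1).
    In this encoding XOR is just multiplication: (+1)(-1) = -1, etc."""
    return np.sign(a * b)

def xor_soft(a, b, scale=5.0):
    """Smooth tanh surrogate of XOR, usable in backward propagation.
    Approaches xor_hard as `scale` grows (illustrative choice)."""
    return np.tanh(scale * a) * np.tanh(scale * b)

def xor_soft_grad(a, b, scale=5.0):
    """Analytic gradients of xor_soft w.r.t. a and b (product rule)."""
    ta, tb = np.tanh(scale * a), np.tanh(scale * b)
    da = scale * (1.0 - ta ** 2) * tb   # d/da [tanh(s a) tanh(s b)]
    db = scale * (1.0 - tb ** 2) * ta   # d/db [tanh(s a) tanh(s b)]
    return da, db
```

In a FleXOR-style setup one would feed a small number of encrypted (compressed) bits through a network of such gates to reconstruct the full set of binary weights, training the bit decisions end to end through the soft gates; the sketch above only shows the differentiable gate itself.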