Paper Title
Distilling Spikes: Knowledge Distillation in Spiking Neural Networks
Paper Authors
Paper Abstract
Spiking Neural Networks (SNN) are energy-efficient computing architectures that exchange spikes for processing information, unlike classical Artificial Neural Networks (ANN). Due to this, SNNs are better suited for real-life deployments. However, similar to ANNs, SNNs also benefit from deeper architectures to obtain improved performance. Furthermore, like deep ANNs, the memory, compute, and power requirements of SNNs also increase with model size, and model compression becomes a necessity. Knowledge distillation is a model compression technique that enables transferring the learning of a large machine learning model to a smaller model with minimal loss in performance. In this paper, we propose techniques for knowledge distillation in spiking neural networks for the task of image classification. We present ways to distill spikes from a larger SNN, also called the teacher network, to a smaller one, also called the student network, while minimally impacting the classification accuracy. We demonstrate the effectiveness of the proposed method with detailed experiments on three standard datasets while proposing novel distillation methodologies and loss functions. We also present a multi-stage knowledge distillation technique for SNNs using an intermediate network to obtain higher performance from the student network. Our approach is expected to open up new avenues for deploying high-performing large SNN models on resource-constrained hardware platforms.
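The abstract does not specify the paper's distillation losses, so the following is only a minimal sketch of the general teacher-student idea it describes, assuming PyTorch and rate-coded spike outputs of shape (batch, classes, timesteps). The function name spike_distillation_loss and the hyperparameters temperature and alpha are illustrative assumptions, not taken from the paper.

```python
# Hedged sketch: classical response-based knowledge distillation applied to
# spike-count outputs of a teacher and a student SNN. Not the paper's method.
import torch
import torch.nn.functional as F


def spike_distillation_loss(student_spikes, teacher_spikes, labels,
                            temperature=4.0, alpha=0.5):
    """Cross-entropy on the student's spike rates plus a KL term that
    matches the teacher's softened spike-rate distribution."""
    # Accumulate spikes over the time dimension to obtain rate-coded logits.
    student_logits = student_spikes.sum(dim=-1)
    teacher_logits = teacher_spikes.sum(dim=-1)

    # Standard classification loss for the student network.
    ce_loss = F.cross_entropy(student_logits, labels)

    # Softened teacher/student distributions, as in classical distillation.
    kd_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2

    return alpha * ce_loss + (1.0 - alpha) * kd_loss
```

In a multi-stage setup like the one the abstract mentions, such a loss could be applied twice: first to distill the teacher into an intermediate network, then to distill the intermediate network into the final student.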