Paper Title
Knapsack Pruning with Inner Distillation
Paper Authors
Paper Abstract
Neural network pruning reduces the computational cost of an over-parameterized network to improve its efficiency. Popular methods vary from $\ell_1$-norm sparsification to Neural Architecture Search (NAS). In this work, we propose a novel pruning method that optimizes the final accuracy of the pruned network and distills knowledge from the over-parameterized parent network's inner layers. To enable this approach, we formulate network pruning as a Knapsack Problem which optimizes the trade-off between the importance of neurons and their associated computational cost. Then we prune the network channels while maintaining the high-level structure of the network. The pruned network is fine-tuned under the supervision of the parent network using its inner network knowledge, a technique we refer to as Inner Knowledge Distillation. Our method leads to state-of-the-art pruning results on ImageNet, CIFAR-10 and CIFAR-100 using ResNet backbones. To prune complex network structures such as convolutions with skip-links and depth-wise convolutions, we propose a block grouping approach to cope with these structures. Through this, we produce compact architectures with the same FLOPs as EfficientNet-B0 and MobileNetV3 but with higher accuracy, by $1\%$ and $0.3\%$ respectively on ImageNet, and faster runtime on GPU.
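To make the knapsack formulation concrete, the following is a minimal sketch of channel pruning cast as a 0/1 knapsack: each channel has an importance score and a FLOP cost, and the goal is to keep the subset of channels that maximizes total importance within a FLOP budget. This is an illustration under assumptions, not the authors' implementation: the function name `knapsack_prune`, the dynamic-programming solver, and the toy scores and budget are all hypothetical, and the paper's actual importance measure and solver may differ.

```python
# Sketch: channel pruning as a 0/1 knapsack problem.
# Assumptions (not from the paper text): per-channel importance scores and
# per-channel FLOP costs are given as inputs; a standard dynamic-programming
# knapsack solver is used, which need not match the authors' exact solver.

from typing import List


def knapsack_prune(importance: List[float],
                   flop_cost: List[int],
                   flop_budget: int) -> List[int]:
    """Return indices of channels to keep: the subset maximizing total
    importance while its summed FLOP cost stays within flop_budget."""
    n = len(importance)
    # dp[c] = best total importance achievable with at most c FLOPs
    dp = [0.0] * (flop_budget + 1)
    # keep[c] = boolean mask of channels selected to reach dp[c]
    keep = [[False] * n for _ in range(flop_budget + 1)]

    for i in range(n):
        cost, value = flop_cost[i], importance[i]
        # iterate budgets downwards so each channel is used at most once
        for c in range(flop_budget, cost - 1, -1):
            if dp[c - cost] + value > dp[c]:
                dp[c] = dp[c - cost] + value
                keep[c] = keep[c - cost].copy()
                keep[c][i] = True

    best_c = max(range(flop_budget + 1), key=lambda c: dp[c])
    return [i for i, kept in enumerate(keep[best_c]) if kept]


# Toy usage with 5 hypothetical channels.
if __name__ == "__main__":
    importance = [0.9, 0.1, 0.5, 0.7, 0.2]
    flop_cost = [4, 2, 3, 4, 1]
    kept = knapsack_prune(importance, flop_cost, flop_budget=8)
    print("channels kept:", kept)  # [0, 3] -> total cost 8, importance 1.6
```

In practice the per-channel FLOP costs and importance scores would be derived from the parent network itself; the sketch only shows how the importance-versus-cost trade-off maps onto a knapsack objective.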