Paper Title
Hierarchical Group Sparse Regularization for Deep Convolutional Neural Networks
Paper Authors
Abstract
In a deep neural network (DNN), the number of parameters is usually huge in order to achieve high learning performance. As a result, the network consumes a lot of memory and substantial computational resources, and is also prone to overfitting. It is known that some parameters are redundant and can be removed from the network without decreasing performance. Many sparse regularization criteria have been proposed to solve this problem. In a convolutional neural network (CNN), group sparse regularization is often used to remove unnecessary subsets of the weights, such as filters or channels. However, when the group sparse regularization treats the weights connected to a neuron as a group, the individual convolution filters are not treated as target groups in the regularization. In this paper, we introduce the concept of hierarchical grouping to solve this problem, and we propose several hierarchical group sparse regularization criteria for CNNs. Our proposed hierarchical group sparse regularization can treat the weights connected to an input neuron or an output neuron as a group, and each convolutional filter as a subgroup within that group, in order to prune unnecessary subsets of weights. As a result, we can prune the weights more adequately depending on the structure of the network and the number of channels, while keeping high performance. In the experiments, we investigate the effectiveness of the proposed sparse regularizations through intensive comparison experiments on public datasets with several network architectures. Code is available on GitHub: "https://github.com/K-Mitsuno/hierarchical-group-sparse-regularization"
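To make the two-level grouping concrete, below is a minimal NumPy sketch of a penalty that groups a convolutional layer's weights at both levels described in the abstract: per output neuron (all weights feeding one output channel) and per filter (each kH x kW kernel connecting one input channel to one output channel). This is an illustration of the grouping idea only, not the paper's exact criteria (the paper proposes several variants); the function name and the regularization coefficients are assumptions for this sketch.

```python
import numpy as np

def hierarchical_group_penalty(weight, lam_neuron=1e-4, lam_filter=1e-4):
    """Two-level group sparse penalty for a conv weight tensor (sketch).

    weight: array of shape (C_out, C_in, kH, kW).
    Outer groups: all weights of one output neuron (one slice weight[c_out]).
    Inner groups: each (kH, kW) filter weight[c_out, c_in].
    The penalty sums the L2 norms of the groups at both levels, so
    gradient descent drives entire filters and entire neurons toward zero.
    """
    # L2 norm of each kH x kW filter -> shape (C_out, C_in)
    filter_norms = np.sqrt((weight ** 2).sum(axis=(2, 3)))
    # L2 norm over each output neuron's full weight slice -> shape (C_out,)
    neuron_norms = np.sqrt((weight ** 2).sum(axis=(1, 2, 3)))
    return lam_filter * filter_norms.sum() + lam_neuron * neuron_norms.sum()
```

In training, such a penalty would be added to the task loss; after convergence, groups whose norms fall below a threshold can be pruned, which is the filter/channel-level sparsity the abstract refers to.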