Paper Title

Structured Convolutions for Efficient Neural Network Design

Paper Authors

Yash Bhalgat, Yizhe Zhang, Jamie Lin, Fatih Porikli

Paper Abstract

In this work, we tackle model efficiency by exploiting redundancy in the \textit{implicit structure} of the building blocks of convolutional neural networks. We start our analysis by introducing a general definition of Composite Kernel structures that enable the execution of convolution operations in the form of efficient, scaled, sum-pooling components. As its special case, we propose \textit{Structured Convolutions} and show that these allow decomposition of the convolution operation into a sum-pooling operation followed by a convolution with significantly lower complexity and fewer weights. We show how this decomposition can be applied to 2D and 3D kernels as well as the fully-connected layers. Furthermore, we present a Structural Regularization loss that promotes neural network layers to leverage on this desired structure in a way that, after training, they can be decomposed with negligible performance loss. By applying our method to a wide range of CNN architectures, we demonstrate "structured" versions of the ResNets that are up to 2$\times$ smaller and a new Structured-MobileNetV2 that is more efficient while staying within an accuracy loss of 1% on ImageNet and CIFAR-10 datasets. We also show similar structured versions of EfficientNet on ImageNet and HRNet architecture for semantic segmentation on the Cityscapes dataset. Our method performs equally well or superior in terms of the complexity reduction in comparison to the existing tensor decomposition and channel pruning methods.
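
The core decomposition described in the abstract can be illustrated with a short numerical sketch. The snippet below is an illustrative example (not the authors' code), assuming PyTorch and hypothetical sizes N = 5 and n = 3: a 5x5 "structured" kernel built from a 3x3 weight tensor `alpha` acts identically to a parameter-free 3x3 sum-pooling (convolution with an all-ones kernel) followed by a 3x3 convolution with `alpha`.

```python
# Illustrative sketch of the sum-pooling + small-convolution decomposition.
# Sizes (N = 5, n = 3) and variable names are hypothetical choices, not the
# authors' implementation.
import torch
import torch.nn.functional as F

torch.manual_seed(0)

N, n = 5, 3            # structured kernel size and underlying small kernel size
p = N - n + 1          # sum-pooling window size (here 3)

alpha = torch.randn(1, 1, n, n)   # the n x n weights that are actually stored
ones = torch.ones(1, 1, p, p)     # all-ones kernel: convolving with it = sum-pooling

# A structured N x N kernel is the full convolution of alpha with the all-ones kernel.
W = F.conv2d(alpha, ones, padding=p - 1)   # shape (1, 1, N, N)

x = torch.randn(1, 1, 16, 16)

# Direct convolution with the full N x N structured kernel (N*N = 25 weights).
y_direct = F.conv2d(x, W)

# Decomposed form: parameter-free sum-pooling, then an n x n convolution (n*n = 9 weights).
y_decomposed = F.conv2d(F.conv2d(x, ones), alpha)

print(torch.allclose(y_direct, y_decomposed, atol=1e-5))  # True
```

The two outputs match, while the decomposed form stores only n^2 = 9 weights instead of N^2 = 25; the paper generalizes this idea to 3D (multi-channel) kernels and fully-connected layers.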
