Paper Title
PSConv: Squeezing Feature Pyramid into One Compact Poly-Scale Convolutional Layer
Paper Authors
Paper Abstract
Despite their strong modeling capacities, Convolutional Neural Networks (CNNs) are often scale-sensitive. To enhance the robustness of CNNs to scale variation, multi-scale feature fusion from different layers or filters has attracted great attention among existing solutions, while the more granular kernel space is overlooked. We bridge this gap by exploiting multi-scale features at a finer granularity. The proposed convolution operation, named Poly-Scale Convolution (PSConv), mixes a spectrum of dilation rates and tactfully allocates them among the individual convolutional kernels of each filter within a single convolutional layer. Specifically, the dilation rates vary cyclically along the axes of the input and output channels of the filters, aggregating features over a wide range of scales in a neat style. PSConv can serve as a drop-in replacement for the vanilla convolution in many prevailing CNN backbones, enabling better representation learning without introducing additional parameters or computational complexity. Comprehensive experiments on the ImageNet and MS COCO benchmarks validate the superior performance of PSConv. Code and models are available at https://github.com/d-li14/PSConv.
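The cyclic-dilation idea in the abstract can be sketched in a few lines of PyTorch. The block below is a minimal illustration, not the authors' reference implementation (see the repository linked above): it simplifies PSConv by cycling the dilation rate across output-channel groups only, whereas the paper also varies it along the input-channel axis within each kernel. The class name `PolyScaleConv2d` and the `(1, 2, 4)` dilation cycle are illustrative assumptions.

```python
# Minimal sketch of the cyclic-dilation scheme behind PSConv (simplified:
# dilation varies across output-channel groups only; the paper also varies
# it along the input-channel axis). Illustrative, not the official code.
import torch
import torch.nn as nn


class PolyScaleConv2d(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=3,
                 dilation_cycle=(1, 2, 4)):
        super().__init__()
        assert out_channels % len(dilation_cycle) == 0
        group_width = out_channels // len(dilation_cycle)
        # One dilated convolution per dilation rate in the cycle. Padding
        # d * (kernel_size - 1) // 2 keeps the spatial size unchanged, so
        # the group outputs can be concatenated along the channel axis.
        self.branches = nn.ModuleList([
            nn.Conv2d(in_channels, group_width, kernel_size,
                      padding=d * (kernel_size - 1) // 2, dilation=d)
            for d in dilation_cycle
        ])

    def forward(self, x):
        # Each branch sees the full input but covers a different receptive
        # field; concatenation mixes multiple scales within a single layer.
        return torch.cat([branch(x) for branch in self.branches], dim=1)


if __name__ == "__main__":
    layer = PolyScaleConv2d(64, 96)
    y = layer(torch.randn(1, 64, 56, 56))
    print(y.shape)  # torch.Size([1, 96, 56, 56])
```

Note that this sketch preserves the parameter count claimed in the abstract: the branches together hold `in_channels * out_channels * k * k` weights, exactly as a vanilla convolution of the same shape, since dilation only spreads the kernel taps without adding weights.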