Title
Wavelet-Attention CNN for Image Classification
Authors
Abstract
Feature learning methods based on convolutional neural networks (CNNs) have achieved tremendous success in image classification tasks. However, inherent noise and other factors may weaken the effectiveness of the convolutional feature statistics. In this paper, we investigate the Discrete Wavelet Transform (DWT) in the frequency domain and design a new Wavelet-Attention (WA) block that implements attention only in the high-frequency domain. Based on this, we propose a Wavelet-Attention convolutional neural network (WA-CNN) for image classification. Specifically, WA-CNN decomposes the feature maps into low-frequency and high-frequency components, which store the structures of the basic objects and the detailed information together with noise, respectively. The WA block is then leveraged to capture the detailed information in the high-frequency domain with different attention factors, while preserving the basic object structures in the low-frequency domain. Experimental results on the CIFAR-10 and CIFAR-100 datasets show that the proposed WA-CNN achieves significant improvements in classification accuracy over related networks. Specifically, with a MobileNetV2 backbone, WA-CNN improves Top-1 accuracy by 1.26% on CIFAR-10 and by 1.54% on CIFAR-100.
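The decomposition described in the abstract can be sketched as follows. This is a minimal illustrative example, not the paper's implementation: it assumes a single-level 2-D Haar DWT on one feature map and a simple sigmoid gating as a stand-in for the attention factors; the function names (`haar_dwt2`, `wavelet_attention`) are hypothetical.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2-D Haar DWT of a (H, W) feature map.

    Returns the low-frequency approximation (LL), which carries the
    basic object structure, and the three high-frequency detail
    sub-bands (LH, HL, HH), which carry edges/texture and noise.
    """
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # approximation (low frequency)
    lh = (a - b + c - d) / 2.0  # horizontal detail
    hl = (a + b - c - d) / 2.0  # vertical detail
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh

def wavelet_attention(x):
    """Illustrative WA step (assumption, not the paper's exact block):
    leave the low-frequency component untouched and reweight each
    high-frequency sub-band with a per-element sigmoid attention factor,
    so strong details are kept while weak responses (noise) are damped.
    """
    ll, lh, hl, hh = haar_dwt2(x)
    attended = [h * (1.0 / (1.0 + np.exp(-np.abs(h)))) for h in (lh, hl, hh)]
    return ll, attended
```

In the actual network, the attention factors would be learned and the operation applied per channel inside a CNN block; the sketch only shows why splitting the signal lets attention act on details without disturbing the low-frequency structure.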