Paper Title

The Devil is in the Channels: Mutual-Channel Loss for Fine-Grained Image Classification

Authors

Dongliang Chang, Yifeng Ding, Jiyang Xie, Ayan Kumar Bhunia, Xiaoxu Li, Zhanyu Ma, Ming Wu, Jun Guo, Yi-Zhe Song

Abstract

Key to solving fine-grained image categorization is finding discriminative, local regions that correspond to subtle visual traits. Great strides have been made, with complex networks designed specifically to learn part-level discriminative feature representations. In this paper, we show it is possible to cultivate subtle details without the need for overly complicated network designs or training mechanisms -- a single loss is all it takes. The main trick lies in how we delve into individual feature channels early on, as opposed to the convention of starting from a consolidated feature map. The proposed loss function, termed the mutual-channel loss (MC-Loss), consists of two channel-specific components: a discriminality component and a diversity component. The discriminality component forces all feature channels belonging to the same class to be discriminative, through a novel channel-wise attention mechanism. The diversity component additionally constrains the channels so that they become mutually exclusive spatially. The end result is therefore a set of feature channels, each reflecting a different locally discriminative region for a specific class. The MC-Loss can be trained end-to-end, without the need for any bounding-box/part annotations, and yields highly discriminative regions during inference. Experimental results show that our MC-Loss, when implemented on top of common base networks, can achieve state-of-the-art performance on all four fine-grained categorization datasets (CUB-Birds, FGVC-Aircraft, Flowers-102, and Stanford-Cars). Ablation studies further demonstrate the superiority of the MC-Loss when compared with other recently proposed general-purpose losses for visual classification, on two different base networks. Code is available at https://github.com/dongliangchang/Mutual-Channel-Loss
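The abstract describes the two components of the MC-Loss only at a high level. The sketch below is a minimal PyTorch illustration of that description, assuming a feature map that reserves xi channels per class; the channel-masking rate, the diversity weight lambda_div, and the exact sign/weighting of the diversity term are illustrative assumptions here, and the authors' reference implementation in the linked repository should be consulted for the actual formulation.

```python
# Minimal sketch of a mutual-channel-style loss, assuming the last conv layer
# outputs num_classes * xi feature channels. Hyperparameters are illustrative.
import torch
import torch.nn.functional as F


def mutual_channel_loss(feature_maps, labels, num_classes, xi=3, lambda_div=1.0):
    """feature_maps: (N, num_classes * xi, H, W); labels: (N,) long tensor."""
    n, c, h, w = feature_maps.shape
    assert c == num_classes * xi

    # --- discriminality component ---
    # Channel-wise attention, simplified here as a random binary mask over channels,
    # so that every channel in a class group must carry class evidence on its own.
    mask = (torch.rand(n, c, 1, 1, device=feature_maps.device) > 0.3).float()
    attended = feature_maps * mask
    # Cross-channel max pooling within each class group, then global average pooling
    # to obtain one logit per class, scored with cross-entropy.
    grouped = attended.reshape(n, num_classes, xi, h, w)
    class_maps = grouped.max(dim=2).values          # (N, num_classes, H, W)
    logits = class_maps.mean(dim=(2, 3))            # (N, num_classes)
    l_dis = F.cross_entropy(logits, labels)

    # --- diversity component ---
    # Spatial softmax per channel; summing the per-location maximum across the xi
    # channels of a class is largest when the channels respond at disjoint locations,
    # so this term rewards spatially mutually exclusive channels.
    flat = feature_maps.reshape(n, num_classes, xi, h * w)
    spatial = F.softmax(flat, dim=-1)
    l_div = spatial.max(dim=2).values.sum(dim=-1).mean()

    # Diversity is rewarded (subtracted); the weighting is an assumption.
    return l_dis - lambda_div * l_div
```

In training, a term of this kind would typically be added, with suitable weights, to the standard cross-entropy loss of the base network and optimized end-to-end, which is consistent with the abstract's claim that no bounding-box/part annotations are needed.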
