Paper title
Assessing The Importance Of Colours For CNNs In Object Recognition
Paper authors
Abstract
Humans rely heavily on shape as the primary cue for object recognition. As secondary cues, colours and textures are also beneficial in this regard. Convolutional neural networks (CNNs), an imitation of biological neural networks, have been shown to exhibit conflicting properties. Some studies indicate that CNNs are biased towards textures, whereas another set of studies suggests a shape bias in classification tasks. However, these studies do not discuss the role of colours, implying that colour may play only a modest role in object recognition. In this paper, we empirically investigate the importance of colours in object recognition for CNNs. We demonstrate that CNNs often rely heavily on colour information when making a prediction. Our results show that the degree of dependency on colours tends to vary from one dataset to another. Moreover, networks tend to rely more on colours if trained from scratch, and pre-training can make the model less colour dependent. To establish these findings, we follow the framework often used to understand the role of colours in object recognition for humans. We take a model trained on congruent images (images in their original colours, e.g. red strawberries) and evaluate it on congruent, greyscale, and incongruent images (images in unnatural colours, e.g. blue strawberries). We measure and analyse the network's predictive performance (top-1 accuracy) under these different stylisations. We use standard datasets for supervised image classification and fine-grained image classification in our experiments.
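The evaluation protocol described in the abstract can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the abstract does not say how incongruent images are generated, so the channel rotation in `to_incongruent` is an assumption, as are the function names and the `predict_fn` interface (a hypothetical callable that maps a batch of images to per-class scores).

```python
import numpy as np


def to_greyscale(images):
    """Remove colour by replacing each pixel with its luminance.

    `images` has shape (..., 3) in RGB order. The result keeps three
    identical channels so it can be fed to a model expecting RGB input.
    """
    weights = np.array([0.299, 0.587, 0.114])  # ITU-R BT.601 luma weights
    luma = images.astype(np.float64) @ weights
    grey = np.repeat(luma[..., None], 3, axis=-1)
    return grey.round().astype(images.dtype)


def to_incongruent(images):
    """ASSUMED recolouring: rotate the RGB channels.

    Shape and texture are untouched, but pure red becomes pure blue.
    The paper may use a different recolouring method.
    """
    return images[..., [1, 2, 0]]


def top1_accuracy(scores, labels):
    """Fraction of samples whose highest-scoring class equals the label."""
    return float(np.mean(np.argmax(scores, axis=1) == labels))


def evaluate_conditions(predict_fn, images, labels):
    """Top-1 accuracy of one fixed model under the three stylisations."""
    conditions = {
        "congruent": images,
        "greyscale": to_greyscale(images),
        "incongruent": to_incongruent(images),
    }
    return {
        name: top1_accuracy(predict_fn(batch), labels)
        for name, batch in conditions.items()
    }
```

A large drop from the congruent accuracy to the greyscale or incongruent accuracy would indicate that the model leans on colour, which is the comparison the abstract describes.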