Paper Title
Analyzing Representations inside Convolutional Neural Networks
Paper Authors
Abstract
How can we discover and succinctly summarize the concepts that a neural network has learned? This task is of great importance wherever networks are applied to classification-based inference, such as medical diagnosis from fMRI or X-ray images. In this work, we propose a framework that categorizes the concepts a network learns by clustering a set of input examples, clustering neurons according to the examples they activate for, and clustering input features, all in the same latent space. The framework is unsupervised and works without any labels for input features; it needs only the network's internal activations for each input example, which makes it widely applicable. We extensively evaluate the proposed method and demonstrate that it produces human-understandable and coherent concepts that a ResNet-18 has learned on the CIFAR-100 dataset.
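To make the idea of a shared latent space concrete, the sketch below factors an examples-by-neurons activation matrix with non-negative matrix factorization (NMF), so that each example and each neuron receives a vector in the same k-dimensional space and can be assigned to a cluster there. This is an illustrative stand-in and not necessarily the factorization the paper uses: the abstract does not specify the method, the activation matrix here is synthetic rather than recorded from a ResNet-18, and the function names `nmf` and `synthetic_activations` are hypothetical. Input features, the third item the abstract places in the latent space, are left out for brevity.

```python
import numpy as np


def nmf(A, k, n_iter=200, seed=0, eps=1e-9):
    """Approximate a non-negative matrix A (examples x neurons) as U @ V.T.

    Uses Lee-Seung multiplicative updates for the Frobenius objective.
    Rows of U embed examples and rows of V embed neurons in the same
    k-dimensional latent space.
    """
    rng = np.random.default_rng(seed)
    n, m = A.shape
    U = rng.random((n, k)) + eps
    V = rng.random((m, k)) + eps
    for _ in range(n_iter):
        U *= (A @ V) / (U @ (V.T @ V) + eps)
        V *= (A.T @ U) / (V @ (U.T @ U) + eps)
    return U, V


def synthetic_activations(n_per_group=20, m_per_group=8, k=3, seed=0):
    """Build a toy activation matrix (assumption: stands in for real
    post-ReLU activations). Each group of examples strongly activates
    one group of neurons, plus a small amount of background noise."""
    rng = np.random.default_rng(seed)
    A = 0.05 * rng.random((k * n_per_group, k * m_per_group))
    for g in range(k):
        rows = slice(g * n_per_group, (g + 1) * n_per_group)
        cols = slice(g * m_per_group, (g + 1) * m_per_group)
        A[rows, cols] += 1.0
    return A


A = synthetic_activations()
U, V = nmf(A, k=3)

# Assign every example and every neuron to its dominant latent component.
example_cluster = U.argmax(axis=1)
neuron_cluster = V.argmax(axis=1)

# Relative reconstruction error of the low-rank approximation.
rel_err = np.linalg.norm(A - U @ V.T) / np.linalg.norm(A)
print("relative reconstruction error:", rel_err)
```

With real data, `A` would instead hold the activations of one layer for each input example. Examples and neurons that share a dominant component can then be inspected together as a candidate concept.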