Title
Explaining Deep Neural Networks using Unsupervised Clustering
Authors
Abstract
We propose a novel method to explain trained deep neural networks (DNNs) by distilling them into surrogate models using unsupervised clustering. Our method can be applied flexibly to any subset of layers of a DNN architecture and can incorporate both low-level and high-level information. On image datasets with pre-trained DNNs, we demonstrate the strength of our method in finding similar training samples and in shedding light on the concepts the DNNs base their decisions on. Via user studies, we show that our model can improve user trust in the model's predictions.
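
The idea of distilling a DNN into a clustering-based surrogate can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or the way activations are collected, so everything here is an assumption: plain k-means stands in for the unsupervised clustering step, `activations` is a hypothetical list of feature vectors already extracted from the chosen layers, and the function names (`kmeans`, `fit_surrogate`, `explain`) are illustrative rather than the authors' API. Each cluster is labeled with the DNN's majority prediction among its members, and a query is explained by the training samples that share its cluster.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means (an assumed stand-in for the paper's clustering step)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        assignments = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    assignments = [nearest(p, centroids) for p in points]
    return centroids, assignments


def fit_surrogate(activations, dnn_predictions, k):
    """Cluster activations and label each cluster by the DNN's majority vote."""
    centroids, assignments = kmeans(activations, k)
    cluster_labels = []
    for c in range(k):
        votes = [pred for pred, a in zip(dnn_predictions, assignments) if a == c]
        cluster_labels.append(max(set(votes), key=votes.count) if votes else None)
    return centroids, cluster_labels, assignments


def explain(activation, centroids, cluster_labels, assignments):
    """Return the surrogate's label and the indices of same-cluster training samples."""
    c = nearest(activation, centroids)
    similar = [i for i, a in enumerate(assignments) if a == c]
    return cluster_labels[c], similar


# Toy data: hypothetical 2-D activations and the DNN's predictions for them.
activations = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1],
               [5.0, 5.1], [5.1, 4.9], [4.9, 5.0]]
dnn_predictions = ["cat", "cat", "cat", "dog", "dog", "dog"]

centroids, cluster_labels, assignments = fit_surrogate(activations, dnn_predictions, k=2)
label, similar = explain([0.1, 0.1], centroids, cluster_labels, assignments)
print(label, similar)
```

In this sketch, combining low-level and high-level information would amount to concatenating activations from several layers into one vector per sample before clustering; that too is an assumption about how such a combination could be done, not a description of the paper's procedure.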