Paper Title
FuseVis: Interpreting neural networks for image fusion using per-pixel saliency visualization
Paper Authors
Paper Abstract
Image fusion merges two or more images to construct a single, more informative fused image. Recently, unsupervised-learning-based convolutional neural networks (CNNs) have been utilized for different types of image fusion tasks, such as medical image fusion, infrared-visible image fusion for autonomous driving, and multi-focus and multi-exposure image fusion for satellite imagery. However, it is challenging to analyze the reliability of these CNNs for image fusion tasks since no ground truth is available. This has led to the use of a wide variety of model architectures and optimization functions, yielding quite different fusion results. Additionally, due to the highly opaque nature of such neural networks, it is difficult to explain the internal mechanics behind their fusion results. To overcome these challenges, we present a novel real-time visualization tool, named FuseVis, with which the end user can compute per-pixel saliency maps that examine the influence of the input image pixels on each pixel of the fused image. We trained several image-fusion CNNs on medical image pairs and then, using our FuseVis tool, performed case studies on a specific clinical application by interpreting the saliency maps of each fusion method. We specifically visualized the relative influence of each input image on the predictions of the fused image and showed that some of the evaluated image fusion methods are better suited to the specific clinical application. To the best of our knowledge, there is currently no approach for the visual analysis of neural networks for image fusion. Therefore, this work opens up a new research direction for improving the interpretability of deep fusion networks. The FuseVis tool can also be adapted to other deep-neural-network-based image processing applications to make them interpretable.
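The per-pixel saliency idea from the abstract can be illustrated with a minimal sketch. This is not the paper's method: FuseVis differentiates a trained fusion CNN, whereas the toy `fuse` function below (a hypothetical pixel-wise average) and the finite-difference `per_pixel_saliency` helper are assumptions made here purely to show what "influence of each input pixel on one fused pixel" means.

```python
import numpy as np

def fuse(a, b):
    # Toy stand-in for a fusion network: pixel-wise average of two images.
    # A real fusion CNN would be differentiated with autograd instead.
    return 0.5 * (a + b)

def per_pixel_saliency(a, b, y, x, eps=1e-4):
    """Finite-difference saliency of fused pixel (y, x) w.r.t. every input pixel.

    Returns two maps (same shape as the inputs): the estimated influence of
    each pixel of image `a` and of image `b` on the chosen fused pixel.
    """
    base = fuse(a, b)[y, x]
    sal_a = np.zeros_like(a)
    sal_b = np.zeros_like(b)
    for (i, j), _ in np.ndenumerate(a):
        da = a.copy()
        da[i, j] += eps
        sal_a[i, j] = (fuse(da, b)[y, x] - base) / eps
        db = b.copy()
        db[i, j] += eps
        sal_b[i, j] = (fuse(a, db)[y, x] - base) / eps
    return sal_a, sal_b

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a, b = rng.random((4, 4)), rng.random((4, 4))
    sal_a, sal_b = per_pixel_saliency(a, b, y=1, x=2)
    # For an averaging fusion, only the co-located input pixel matters,
    # and each input contributes with weight 0.5.
    print(round(float(sal_a[1, 2]), 3), round(float(sal_b[1, 2]), 3))
```

The relative magnitudes of `sal_a` and `sal_b` are what a FuseVis-style tool visualizes: they reveal which input image dominates each region of the fused output, which is the basis for the clinical case studies described above.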