Paper Title

On Generating Plausible Counterfactual and Semi-Factual Explanations for Deep Learning

Authors

Kenny, Eoin M., Keane, Mark T.

Abstract

There is a growing concern that the recent progress made in AI, especially regarding the predictive competence of deep learning models, will be undermined by a failure to properly explain their operation and outputs. In response to this disquiet, counterfactual explanations have become massively popular in eXplainable AI (XAI) due to their proposed computational, psychological, and legal benefits. In contrast, however, semi-factuals, a related way in which humans commonly explain their reasoning, have surprisingly received no attention. Most counterfactual methods address tabular rather than image data, partly because the non-discrete nature of the latter makes good counterfactuals difficult to define. Additionally, generating plausible-looking explanations that lie on the data manifold is another issue that hampers progress. This paper advances a novel method for generating plausible counterfactuals (and semi-factuals) for black-box CNN classifiers doing computer vision. The method, called PlausIble Exceptionality-based Contrastive Explanations (PIECE), modifies all exceptional features in a test image to be normal from the perspective of the counterfactual class (hence concretely defining a counterfactual). Two controlled experiments compare this method to others in the literature, showing that PIECE not only generates the most plausible counterfactuals on several measures, but also the best semi-factuals.
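The core idea in the abstract, modifying features that are "exceptional" under the counterfactual class until they look normal for that class, can be illustrated with a minimal sketch. This is not the paper's implementation (PIECE operates on CNN latent features and uses a generative model to render the result as an image); here the function name, the z-score test, and the per-class mean/std statistics are all illustrative assumptions.

```python
import numpy as np

def piece_style_counterfactual(z, cf_mean, cf_std, threshold=2.0):
    """Illustrative sketch of the PIECE intuition: flag latent features that
    are exceptional under the counterfactual class (|z-score| > threshold)
    and shift them to that class's expected value, leaving the rest intact."""
    zscores = (z - cf_mean) / cf_std
    exceptional = np.abs(zscores) > threshold
    z_cf = np.where(exceptional, cf_mean, z)
    return z_cf, exceptional

# Toy latent vector for a test image, plus hypothetical feature statistics
# (mean and std) estimated from training examples of the counterfactual class.
z = np.array([0.1, 5.0, -3.0, 0.4])
cf_mean = np.array([0.0, 0.2, 0.1, 0.5])
cf_std = np.array([1.0, 1.0, 1.0, 1.0])

z_cf, mask = piece_style_counterfactual(z, cf_mean, cf_std)
# Only the two out-of-distribution features are modified; a decoder/GAN
# would then map z_cf back to image space to visualize the counterfactual.
```

Leaving unexceptional features untouched is what keeps the explanation close to the original test image, while normalizing the exceptional ones is what moves it plausibly onto the counterfactual class's data manifold.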
