Paper Title

Explanatory Paradigms in Neural Networks

Authors

Ghassan AlRegib, Mohit Prabhushankar

Abstract


In this article, we present a leap-forward expansion to the study of explainability in neural networks by considering explanations as answers to abstract reasoning-based questions. With $P$ as the prediction from a neural network, these questions are `Why P?', `What if not P?', and `Why P, rather than Q?' for a given contrast prediction $Q$. The answers to these questions are observed correlations, observed counterfactuals, and observed contrastive explanations, respectively. Together, these explanations constitute the abductive reasoning scheme. We term the three explanatory schemes observed explanatory paradigms. The term observed refers to the specific case of post-hoc explainability, when an explanatory technique explains the decision $P$ after a trained neural network has made the decision $P$. The primary advantage of viewing explanations through the lens of abductive reasoning-based questions is that explanations can be used as reasons while making decisions. The post-hoc field of explainability, which previously only justified decisions, becomes active by being involved in the decision-making process and providing limited, but relevant and contextual, interventions. The contributions of this article are: ($i$) realizing explanations as reasoning paradigms, ($ii$) providing a probabilistic definition of observed explanations and their completeness, ($iii$) creating a taxonomy for the evaluation of explanations, ($iv$) positioning gradient-based complete explainability's replicability and reproducibility across multiple applications and data modalities, and ($v$) providing code repositories, publicly available at https://github.com/olivesgatech/Explanatory-Paradigms.
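The three question types can be made concrete with gradients. The following is a minimal sketch, not the paper's implementation: it uses a hypothetical linear "network" so that the gradient of a class logit with respect to the input is exact and easy to inspect. The `Why P?` map is the gradient of the predicted logit, and the `Why P, rather than Q?` map is the gradient of the logit difference against a contrast class `Q`; all variable names here are illustrative.

```python
import numpy as np

# Toy "network": logits = W @ x; the prediction P is the argmax class.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))          # 3 classes, 5 input features
x = rng.normal(size=5)

logits = W @ x
P = int(np.argmax(logits))

# 'Why P?' -- observed correlation: gradient of logit_P w.r.t. x.
# For a linear model this gradient is exactly the row W[P].
why_P = W[P]

# 'Why P, rather than Q?' -- observed contrastive explanation:
# gradient of the logit difference (logit_P - logit_Q) w.r.t. x.
Q = int(np.argsort(logits)[-2])      # runner-up class as the contrast
why_P_not_Q = W[P] - W[Q]

# Features with large positive values in why_P push the decision
# toward P; in why_P_not_Q they push toward P and away from Q.
print("P =", P, "Q =", Q)
print("Why P?           ", np.round(why_P, 3))
print("Why P, not Q?    ", np.round(why_P_not_Q, 3))
```

In a deep network the same quantities are obtained by backpropagating the logit (or logit difference) to the input or to an intermediate feature map, which is the gradient-based setting the abstract refers to.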
