Paper Title

Explaining the Behavior of Black-Box Prediction Algorithms with Causal Learning

Authors

Numair Sani, Daniel Malinsky, Ilya Shpitser

Abstract

Causal approaches to post-hoc explainability for black-box prediction models (e.g., deep neural networks trained on image pixel data) have become increasingly popular. However, existing approaches have two important shortcomings: (i) the "explanatory units" are micro-level inputs into the relevant prediction model, e.g., image pixels, rather than interpretable macro-level features that are more useful for understanding how to possibly change the algorithm's behavior, and (ii) existing approaches assume there exists no unmeasured confounding between features and target model predictions, which fails to hold when the explanatory units are macro-level variables. Our focus is on the important setting where the analyst has no access to the inner workings of the target prediction algorithm, rather only the ability to query the output of the model in response to a particular input. To provide causal explanations in such a setting, we propose to learn causal graphical representations that allow for arbitrary unmeasured confounding among features. We demonstrate the resulting graph can differentiate between interpretable features that causally influence model predictions versus those that are merely associated with model predictions due to confounding. Our approach is motivated by a counterfactual theory of causal explanation wherein good explanations point to factors that are "difference-makers" in an interventionist sense.
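The following is a minimal, hypothetical sketch of the query-based setting described in the abstract; it is not the authors' implementation. Interpretable macro-level features are assembled together with black-box predictions obtained by querying the model, and a constraint-based discovery algorithm that allows for latent confounding (here, FCI as implemented in the `causal-learn` package) is run on the joint dataset. The feature names, the simulated confounder, and the toy black box are all illustrative assumptions.

```python
# Sketch only: distinguishing features that causally influence a black-box
# prediction from features that are merely associated with it via an
# unmeasured confounder. Uses FCI from `causal-learn`, which outputs a
# partial ancestral graph (PAG) that tolerates latent confounding.
import numpy as np
from causallearn.search.ConstraintBased.FCI import fci

rng = np.random.default_rng(0)
n = 2000

# Hypothetical macro-level features: a latent confounder U drives both
# `shape` and `texture`, but only `shape` actually enters the black box.
u = rng.normal(size=n)               # unmeasured confounder (not recorded)
shape = u + rng.normal(size=n)       # causally used by the model
texture = u + rng.normal(size=n)     # merely associated via U
background = rng.normal(size=n)      # irrelevant feature

def black_box_predict(shape, background):
    """Stand-in for querying the opaque model with a particular input.

    Only `shape` affects the output; `background` is accepted but ignored.
    """
    return 2.0 * shape + 0.1 * rng.normal(size=len(shape))

pred = black_box_predict(shape, background)

# Columns: [shape, texture, background, prediction]; U is deliberately omitted.
data = np.column_stack([shape, texture, background, pred])
g, edges = fci(data, alpha=0.05)

# In the resulting PAG, an edge like shape --> prediction supports a causal
# influence on the model's output, whereas a bidirected edge such as
# texture <-> prediction flags an association explained by unmeasured
# confounding rather than a "difference-making" feature.
print(g)
```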
