Paper Title

Multi-Objective Counterfactual Explanations

Authors

Susanne Dandl, Christoph Molnar, Martin Binder, Bernd Bischl

Abstract

Counterfactual explanations are one of the most popular methods to make predictions of black box machine learning models interpretable by providing explanations in the form of 'what-if scenarios'. Most current approaches optimize a collapsed, weighted sum of multiple objectives, which are naturally difficult to balance a priori. We propose the Multi-Objective Counterfactuals (MOC) method, which translates the counterfactual search into a multi-objective optimization problem. Our approach not only returns a diverse set of counterfactuals with different trade-offs between the proposed objectives, but also maintains diversity in feature space. This enables a more detailed post-hoc analysis to facilitate better understanding and also more options for actionable user responses to change the predicted outcome. Our approach is also model-agnostic and works for numerical and categorical input features. We show the usefulness of MOC in concrete cases and compare our approach with state-of-the-art methods for counterfactual explanations.
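To make the multi-objective framing concrete, the following is a minimal sketch of how competing counterfactual objectives and a Pareto filter over them might look. All function names, the toy distance choices (L1 instead of the Gower distance used for mixed feature types), and the objective signatures are illustrative assumptions, not the authors' implementation; MOC itself searches with an evolutionary algorithm rather than filtering a fixed candidate list.

```python
import numpy as np

def objectives(x_cf, x_orig, y_target, model, X_obs):
    """Score one candidate counterfactual on four competing objectives
    (hypothetical simplification for numerical features only)."""
    # o1: how far the prediction for the candidate is from the desired target
    o1 = abs(model(x_cf) - y_target)
    # o2: proximity to the original instance (L1 here; a Gower-style
    # distance would also handle categorical features)
    o2 = float(np.abs(x_cf - x_orig).mean())
    # o3: sparsity -- number of features that were changed
    o3 = int(np.sum(x_cf != x_orig))
    # o4: plausibility -- distance to the closest observed data point
    o4 = float(np.min(np.abs(X_obs - x_cf).mean(axis=1)))
    return (o1, o2, o3, o4)

def pareto_front(scored):
    """Keep candidates whose objective tuples are not dominated:
    no other candidate is at least as good everywhere and better somewhere."""
    front = []
    for i, ci in enumerate(scored):
        dominated = any(
            all(a <= b for a, b in zip(cj, ci))
            and any(a < b for a, b in zip(cj, ci))
            for j, cj in enumerate(scored) if j != i
        )
        if not dominated:
            front.append(ci)
    return front
```

Rather than collapsing `o1`..`o4` into one weighted sum, the Pareto filter returns every candidate that represents a distinct trade-off, which is what yields the diverse set of counterfactuals the abstract describes.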
