Paper Title

Counterfactual Explanations and Algorithmic Recourses for Machine Learning: A Review

Paper Authors

Sahil Verma, Varich Boonsanong, Minh Hoang, Keegan E. Hines, John P. Dickerson, Chirag Shah

Abstract

Machine learning plays a role in many deployed decision systems, often in ways that are difficult or impossible to understand by human stakeholders. Explaining, in a human-understandable way, the relationship between the input and output of machine learning models is essential to the development of trustworthy machine learning based systems. A burgeoning body of research seeks to define the goals and methods of explainability in machine learning. In this paper, we seek to review and categorize research on counterfactual explanations, a specific class of explanation that provides a link between what could have happened had input to a model been changed in a particular way. Modern approaches to counterfactual explainability in machine learning draw connections to the established legal doctrine in many countries, making them appealing to fielded systems in high-impact areas such as finance and healthcare. Thus, we design a rubric with desirable properties of counterfactual explanation algorithms and comprehensively evaluate all currently proposed algorithms against that rubric. Our rubric provides easy comparison and comprehension of the advantages and disadvantages of different approaches and serves as an introduction to major research themes in this field. We also identify gaps and discuss promising research directions in the space of counterfactual explainability.
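To make the notion of a counterfactual explanation concrete, here is a minimal sketch of a Wachter-style counterfactual search on a toy linear classifier. This is illustrative only and not taken from the paper: the model weights, the input, the margin, and the loss (a squared-score term plus a proximity penalty) are all assumptions made for this example.

```python
import numpy as np

# Toy linear "credit scoring" model: w @ x + b > 0 means "approved".
# The weights, bias, and applicant features below are illustrative.
w = np.array([2.0, -1.0])
b = -1.0

def predict(x):
    """Return 1.0 for "approved", 0.0 for "rejected"."""
    return float(w @ x + b > 0)

def counterfactual(x0, lam=0.1, lr=0.05, steps=2000):
    """Gradient-descent search for a point close to x0 that flips the
    decision to "approved", by minimizing the illustrative loss
        (score(x) - margin)^2 + lam * ||x - x0||^2,
    with a small positive margin pushing the score past the boundary."""
    margin = 0.1
    x = x0.copy()
    for _ in range(steps):
        score = w @ x + b
        if score > 0:          # decision already flipped; stop early
            break
        # Gradient of (score - margin)^2 + lam * ||x - x0||^2 w.r.t. x
        grad = 2 * (score - margin) * w + 2 * lam * (x - x0)
        x -= lr * grad
    return x

x0 = np.array([0.2, 1.0])       # a rejected applicant
xcf = counterfactual(x0)        # a nearby point that gets approved
```

The difference `xcf - x0` is the counterfactual explanation: the smallest change (under this distance penalty) to the applicant's features that would have produced a favorable decision.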
