Paper Title

MeLIME: Meaningful Local Explanation for Machine Learning Models

Paper Authors

Botari, Tiago, Hvilshøj, Frederik, Izbicki, Rafael, de Carvalho, Andre C. P. L. F.

Paper Abstract

Most state-of-the-art machine learning algorithms induce black-box models, preventing their application in many sensitive domains. Hence, many methodologies for explaining machine learning models have been proposed to address this problem. In this work, we introduce strategies to improve local explanations by taking into account the distribution of the data used to train the black-box models. We show that our approach, MeLIME, produces more meaningful explanations compared to other techniques over different ML models, operating on various types of data. MeLIME generalizes the LIME method, allowing more flexible perturbation sampling and the use of different local interpretable models. Additionally, we introduce modifications to standard training algorithms of local interpretable models, fostering more robust explanations and even allowing the production of counterfactual examples. To show the strengths of the proposed approach, we include experiments on tabular data, images, and text; all show improved explanations. In particular, MeLIME generated more meaningful explanations on the MNIST dataset than methods such as GuidedBackprop, SmoothGrad, and Layer-wise Relevance Propagation. MeLIME is available at https://github.com/tiagobotari/melime.
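The following is a minimal Python sketch of the core idea summarized above: instead of LIME-style feature-wise Gaussian perturbations, sample perturbations from a density model fit on the training data so they respect its distribution, then fit a distance-weighted linear surrogate whose coefficients serve as the local explanation. This is an illustration under assumed choices (KDE as the density model, Ridge as the surrogate, the helper name explain_locally, and all hyperparameters), not MeLIME's actual API; see the repository above for the real implementation.

```python
# Sketch of distribution-aware local explanation (MeLIME's core idea),
# NOT the melime package API. All names and parameters are illustrative.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KernelDensity
from sklearn.linear_model import Ridge

X, y = load_iris(return_X_y=True)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

# Density model of the training data; samples drawn from it stay close to
# the data manifold, unlike isotropic Gaussian noise around the instance.
kde = KernelDensity(bandwidth=0.5).fit(X)

def explain_locally(x, n_samples=500, radius=1.0):
    # Draw candidate points from the density model.
    candidates = kde.sample(n_samples, random_state=0)
    # Weight candidates by proximity to the instance x (locality kernel).
    dists = np.linalg.norm(candidates - x, axis=1)
    weights = np.exp(-(dists ** 2) / (2 * radius ** 2))
    # Query the black box: probability of the class predicted for x.
    target_class = black_box.predict(x.reshape(1, -1))[0]
    preds = black_box.predict_proba(candidates)[:, target_class]
    # Weighted linear surrogate; its coefficients are the local explanation.
    surrogate = Ridge(alpha=1.0).fit(candidates, preds, sample_weight=weights)
    return surrogate.coef_

print(explain_locally(X[0]))  # per-feature local importance around X[0]
```

Restricting perturbations to the estimated data distribution keeps the surrogate from being fit on points the black-box model never saw during training, which is the source of the more meaningful explanations the abstract claims.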
