论文标题
更深入地看着表格石灰
Looking Deeper into Tabular LIME
论文作者
论文摘要
在本文中,我们对在表格数据的情况下进行了详尽的理论分析。我们证明,在较大的样本限制中,可以按照算法参数的函数以及与黑盒模型相关的一些期望计算来计算表格石灰提供的可解释系数。当要解释的函数具有一些不错的代数结构(根据坐标的子集,线性,乘法或稀疏)时,我们的分析提供了对Lime提供的解释的有趣见解。这些可以应用于一系列机器学习模型,包括高斯内核或卡车随机森林。例如,对于线性函数,我们表明,石灰具有所需的属性,可以提供与函数系数成正比的解释,以解释并忽略该函数未使用的坐标来解释。对于基于分区的回归器,另一方面,我们表明石灰会产生可能提供误导性解释的不希望的人工制品。
In this paper, we present a thorough theoretical analysis of the default implementation of LIME in the case of tabular data. We prove that in the large sample limit, the interpretable coefficients provided by Tabular LIME can be computed in an explicit way as a function of the algorithm parameters and some expectation computations related to the black-box model. When the function to explain has some nice algebraic structure (linear, multiplicative, or sparsely depending on a subset of the coordinates), our analysis provides interesting insights into the explanations provided by LIME. These can be applied to a range of machine learning models including Gaussian kernels or CART random forests. As an example, for linear functions we show that LIME has the desirable property to provide explanations that are proportional to the coefficients of the function to explain and to ignore coordinates that are not used by the function to explain. For partition-based regressors, on the other side, we show that LIME produces undesired artifacts that may provide misleading explanations.