Unifying local and global model explanations by functional decomposition of low dimensional structures

Paper title

Unifying local and global model explanations by functional decomposition of low dimensional structures

Authors

Munir Hiabu, Joseph T. Meyer, Marvin N. Wright

Abstract

We consider a global representation of a regression or classification function obtained by decomposing it into a sum of main and interaction components of arbitrary order. We propose a new identification constraint that allows for the extraction of interventional SHAP values and partial dependence plots, thereby unifying local and global explanations. Under the proposed identification, a feature's partial dependence plot corresponds to its main effect term plus the intercept. The interventional SHAP value of feature $k$ is a weighted sum of the main component and all interaction components that include $k$, with each weight given by the reciprocal of the component's dimension. This offers a new perspective on local explanations such as SHAP values, which were previously motivated only by game theory. We show that the decomposition can be used to reduce direct and indirect bias by removing all components that include a protected feature. Lastly, we motivate a new measure of feature importance. In principle, the proposed functional decomposition can be applied to any machine learning model, but exact calculation is feasible only for low-dimensional structures or ensembles of them. We provide an algorithm and an efficient implementation for gradient-boosted trees (xgboost) and random planted forests. Our experiments suggest that the method provides meaningful explanations and reveals higher-order interactions. The proposed methods are implemented in an R package, available at https://github.com/PlantedML/glex.
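
The relationship stated in the abstract between interventional SHAP values and the decomposition components can be checked numerically on a small example. The sketch below is illustrative only and is not the algorithm implemented in glex: it assumes a hypothetical three-feature toy model, uses an empirical background sample as the reference distribution, and obtains the components by brute-force inclusion-exclusion over partial dependence values, which is only practical for a handful of features. All function names are made up for this illustration.

```python
from itertools import combinations
from math import factorial
import random


def model(x):
    # Hypothetical toy model: two main effects and one three-way interaction.
    return x[0] + 2.0 * x[1] + x[0] * x[1] * x[2]


def subsets(features):
    # All subsets of `features` as tuples, including the empty tuple.
    items = list(features)
    return [c for r in range(len(items) + 1) for c in combinations(items, r)]


def partial_dependence(x, S, background):
    # Interventional value function: features in S are fixed at x,
    # the remaining features are averaged over the background sample.
    total = 0.0
    for b in background:
        z = [x[j] if j in S else b[j] for j in range(len(x))]
        total += model(z)
    return total / len(background)


def decompose(x, background):
    # Components via inclusion-exclusion over partial dependence values:
    # m_S = sum over V subset of S of (-1)^(|S| - |V|) * PD_V.
    pd = {S: partial_dependence(x, S, background) for S in subsets(range(len(x)))}
    m = {S: sum((-1) ** (len(S) - len(V)) * pd[V] for V in subsets(S)) for S in pd}
    return pd, m


def shapley(x, k, pd):
    # Classical Shapley formula applied to the interventional value function.
    d = len(x)
    others = [j for j in range(d) if j != k]
    value = 0.0
    for S in subsets(others):
        weight = factorial(len(S)) * factorial(d - len(S) - 1) / factorial(d)
        with_k = tuple(sorted(S + (k,)))
        value += weight * (pd[with_k] - pd[S])
    return value


random.seed(0)
# Assumed background data: 200 draws of three independent standard normals.
background = [[random.gauss(0, 1) for _ in range(3)] for _ in range(200)]
x = [0.5, -1.0, 2.0]

pd, m = decompose(x, background)
shap_direct = [shapley(x, k, pd) for k in range(3)]
# Weighted sum of all components containing k, weights 1 / |S|.
shap_from_components = [sum(m[S] / len(S) for S in m if k in S) for k in range(3)]

for k in range(3):
    print(k, shap_direct[k], shap_from_components[k])
```

In this sketch the two columns printed for each feature should agree up to floating-point error, the components sum to the model prediction at `x`, and the partial dependence value of a single feature equals its main effect plus the intercept `m[()]`.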
