Paper Title

Monitoring Model Deterioration with Explainable Uncertainty Estimation via Non-parametric Bootstrap

Paper Authors

Carlos Mougan, Dan Saattrup Nielsen

Paper Abstract

Monitoring machine learning models once they are deployed is challenging. It is even more challenging to decide when to retrain models in real-case scenarios when labeled data is beyond reach, and monitoring performance metrics becomes unfeasible. In this work, we use non-parametric bootstrapped uncertainty estimates and SHAP values to provide explainable uncertainty estimation as a technique that aims to monitor the deterioration of machine learning models in deployment environments, as well as determine the source of model deterioration when target labels are not available. Classical methods are purely aimed at detecting distribution shift, which can lead to false positives in the sense that the model has not deteriorated despite a shift in the data distribution. To estimate model uncertainty we construct prediction intervals using a novel bootstrap method, which improves upon the work of Kumar & Srivastava (2012). We show that both our model deterioration detection system and our uncertainty estimation method achieve better performance than the current state-of-the-art. Finally, we use explainable AI techniques to gain an understanding of the drivers of model deterioration. We release an open-source Python package, doubt, which implements our proposed methods, as well as the code used to reproduce our experiments.
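The abstract does not spell out the construction, so the following is a minimal sketch of the general idea: non-parametric bootstrap prediction intervals whose width serves as a per-instance uncertainty signal, followed by a SHAP explanation of that signal. The function name `bootstrap_prediction_interval`, the exact interval construction, and the surrogate-model explainability step are illustrative assumptions made here; they are not the API of the authors' doubt package nor the precise procedure of the paper or of Kumar & Srivastava (2012).

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LinearRegression


def bootstrap_prediction_interval(model, X_train, y_train, X_new,
                                  n_boots=100, alpha=0.05, random_state=0):
    """Simplified non-parametric bootstrap prediction intervals (sketch)."""
    rng = np.random.default_rng(random_state)
    n = len(X_train)

    # Point predictions and training residuals from a fit on the full data.
    full_model = clone(model).fit(X_train, y_train)
    point_preds = full_model.predict(X_new)
    residuals = y_train - full_model.predict(X_train)

    # Refit the model on bootstrap resamples and predict the new points.
    boot_preds = np.empty((n_boots, len(X_new)))
    for b in range(n_boots):
        idx = rng.integers(0, n, size=n)
        boot_preds[b] = clone(model).fit(X_train[idx], y_train[idx]).predict(X_new)

    # Combine model variability (bootstrap spread) with noise variability
    # (resampled residuals) and take empirical quantiles of the errors.
    errors = (boot_preds - boot_preds.mean(axis=0)
              + rng.choice(residuals, size=(n_boots, len(X_new))))
    lower = point_preds + np.quantile(errors, alpha / 2, axis=0)
    upper = point_preds + np.quantile(errors, 1 - alpha / 2, axis=0)
    return point_preds, lower, upper


if __name__ == "__main__":
    from sklearn.datasets import make_regression
    from sklearn.ensemble import GradientBoostingRegressor
    import shap  # pip install shap

    X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)
    X_tr, y_tr, X_te = X[:400], y[:400], X[400:]

    preds, lo, hi = bootstrap_prediction_interval(LinearRegression(), X_tr, y_tr, X_te)
    uncertainty = hi - lo  # interval width as a per-instance uncertainty signal

    # Illustrative explainability step: fit a surrogate model on the
    # uncertainty signal and compute SHAP values to see which features
    # drive it (assumed setup, not necessarily the authors' pipeline).
    surrogate = GradientBoostingRegressor().fit(X_te, uncertainty)
    shap_values = shap.Explainer(surrogate)(X_te)
    print(shap_values.values.shape)  # (n_instances, n_features)
```

In this sketch, rising interval widths on incoming unlabeled data would play the role of the deterioration signal, and the SHAP values on the surrogate indicate which features contribute most to the increased uncertainty.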
