Counterfactual Explanations for Machine Learning on Multivariate Time Series Data

Paper title

Counterfactual Explanations for Machine Learning on Multivariate Time Series Data

Authors

Emre Ates, Burak Aksar, Vitus J. Leung, Ayse K. Coskun

Abstract

Applying machine learning (ML) to multivariate time series data is increasingly popular in many application domains, including computer system management. For example, recent high performance computing (HPC) research proposes a variety of ML frameworks that use system telemetry data in the form of multivariate time series to detect performance variations, perform intelligent scheduling or node allocation, and improve system security. Common barriers to the adoption of these ML frameworks include the lack of user trust and the difficulty of debugging. These barriers need to be overcome to enable the widespread adoption of ML frameworks in production systems. To address this challenge, this paper proposes a novel explainability technique that provides counterfactual explanations for supervised ML frameworks that use multivariate time series data. The proposed method outperforms state-of-the-art explainability methods on several different ML frameworks and data sets in metrics such as faithfulness and robustness. The paper also demonstrates how the proposed method can be used to debug ML frameworks and gain a better understanding of HPC system telemetry data.
