Paper Title

Diagnosing Concept Drift with Visual Analytics

Authors

Weikai Yang, Zhen Li, Mengchen Liu, Yafeng Lu, Kelei Cao, Ross Maciejewski, Shixia Liu

Abstract

Concept drift is a phenomenon in which the distribution of a data stream changes over time in unforeseen ways, causing prediction models built on historical data to become inaccurate. While a variety of automated methods have been developed to identify when concept drift occurs, there is limited support for analysts who need to understand and correct their models when drift is detected. In this paper, we present a visual analytics method, DriftVis, to support model builders and analysts in the identification and correction of concept drift in streaming data. DriftVis combines a distribution-based drift detection method with a streaming scatterplot to support the analysis of drift caused by the distribution changes of data streams and to explore the impact of these changes on the model's accuracy. A quantitative experiment and two case studies on weather prediction and text classification have been conducted to demonstrate our proposed tool and illustrate how visual analytics can be used to support the detection, examination, and correction of concept drift.
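
To make the idea of distribution-based drift detection concrete, here is a minimal sketch that compares a reference window of historical data with a new batch from the stream. It is not the DriftVis implementation: the choice of a one-dimensional energy distance, the function names (`mean_abs_diff`, `energy_distance`, `detect_drift`), the threshold of 0.5, and the synthetic Gaussian data are all assumptions made for illustration.

```python
import random
from itertools import product


def mean_abs_diff(xs, ys):
    """Average |x - y| over all pairs (x, y) with x from xs and y from ys."""
    return sum(abs(x - y) for x, y in product(xs, ys)) / (len(xs) * len(ys))


def energy_distance(reference, batch):
    """Squared energy distance between two 1-D samples (all-pairs estimate).

    Values near 0 suggest similar distributions; larger values suggest
    that the distribution has changed.
    """
    return (
        2 * mean_abs_diff(reference, batch)
        - mean_abs_diff(reference, reference)
        - mean_abs_diff(batch, batch)
    )


def detect_drift(reference, batch, threshold=0.5):
    """Flag drift when the distance exceeds a hypothetical threshold."""
    return energy_distance(reference, batch) > threshold


# Synthetic stream: historical data, a batch from the same distribution,
# and a batch whose mean has shifted (assumed data, for illustration only).
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(200)]
same_batch = [rng.gauss(0.0, 1.0) for _ in range(200)]
shifted_batch = [rng.gauss(2.0, 1.0) for _ in range(200)]

print("same distribution drifted:", detect_drift(reference, same_batch))
print("shifted distribution drifted:", detect_drift(reference, shifted_batch))
```

In practice the threshold would need to be calibrated for the data at hand, and a score like this only signals that the input distribution has changed. Whether the change actually hurts model accuracy still has to be examined separately, which is the gap the abstract says visual analysis is meant to fill.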
