Paper Title

Forecast Evaluation for Data Scientists: Common Pitfalls and Best Practices

Authors

Hewamalage, Hansika, Ackermann, Klaus, Bergmeir, Christoph

Abstract

Machine Learning (ML) and Deep Learning (DL) methods are increasingly replacing traditional methods in many domains involved with important decision-making activities. DL techniques tailor-made for specific tasks such as image recognition, signal processing, or speech analysis are being introduced at a fast pace with many improvements. However, for the domain of forecasting, the current state in the ML community is perhaps where other domains such as Natural Language Processing and Computer Vision were several years ago. The field of forecasting has mainly been fostered by statisticians and econometricians; consequently, the related concepts are not mainstream knowledge among general ML practitioners. The different non-stationarities associated with time series challenge data-driven ML models. Nevertheless, recent trends in the domain have shown that, with the availability of massive amounts of time series, ML techniques are quite competent in forecasting when the related pitfalls are properly handled. Therefore, in this work we provide a tutorial-like compilation of the details of one of the most important steps in the overall forecasting process, namely evaluation. In this way, we intend to present forecast evaluation in a form that fits the ML context, as a means of bridging the knowledge gap between traditional forecasting methods and state-of-the-art ML techniques. We elaborate on the problematic characteristics of time series, such as non-normality and non-stationarity, and how they are associated with common pitfalls in forecast evaluation. Best practices in forecast evaluation are outlined with respect to the different steps, such as data partitioning, error calculation, and statistical testing, among others. Further guidelines are also provided on selecting valid and suitable error measures depending on the specific characteristics of the dataset at hand.
