Paper Title
An Empirical Study on Internet Traffic Prediction Using Statistical Rolling Model
Paper Authors
Paper Abstract
Real-world IP network traffic is susceptible to external and internal factors such as new internet service integration, traffic migration, and internet applications. Because of these factors, actual internet traffic is non-linear and challenging to analyze with a statistical model for future prediction. In this paper, we investigated and evaluated the performance of different statistical prediction models on real IP network traffic and showed a significant improvement in prediction using the rolling prediction technique. First, a set of best hyper-parameters for each prediction model was identified by analyzing the traffic characteristics and running a grid search that minimizes the Akaike Information Criterion (AIC). Then, we performed a comparative performance analysis of AutoRegressive Integrated Moving Average (ARIMA), Seasonal ARIMA (SARIMA), SARIMA with eXogenous factors (SARIMAX), and Holt-Winters for single-step prediction. SARIMA explicitly models the seasonality of our traffic, which reduced the rolling prediction Mean Absolute Percentage Error (MAPE) by more than 4% compared to ARIMA, which cannot handle seasonality. We further improved traffic prediction by using SARIMAX to learn different exogenous factors extracted from the original traffic, which yielded the best rolling prediction results with a MAPE of 6.83%. Finally, we applied exponential smoothing through the Holt-Winters model to handle the variability in traffic, which gave a better prediction than ARIMA (around 1.5% lower MAPE). On real Internet Service Provider (ISP) traffic data, the rolling prediction technique reduced prediction error by more than 50% compared to the standard prediction method.
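The difference between standard and rolling prediction, scored by MAPE, can be illustrated with a minimal sketch. It rests on several assumptions that are not from the paper: a least-squares AR(1) model stands in for ARIMA/SARIMA so the example needs only the standard library, the traffic series is synthetic (a 24-step sinusoid), and the function names `fit_ar1`, `standard_forecast`, and `rolling_forecast` are hypothetical. In the standard forecast the model is fitted once and its own outputs are fed back for later steps. In the rolling forecast the model is refitted before every single-step prediction, and the true observation is appended afterwards.

```python
import math


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def fit_ar1(history):
    """Least-squares fit of y[t] = c + phi * y[t-1]; a stand-in model, not ARIMA."""
    x, y = history[:-1], history[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = sxy / sxx if sxx else 0.0
    return mean_y - phi * mean_x, phi


def standard_forecast(train, horizon):
    """Fit once, then feed each prediction back in as the next input."""
    c, phi = fit_ar1(train)
    preds, last = [], train[-1]
    for _ in range(horizon):
        last = c + phi * last
        preds.append(last)
    return preds


def rolling_forecast(train, test):
    """Refit before every single-step prediction, then append the true value."""
    history, preds = list(train), []
    for observed in test:
        c, phi = fit_ar1(history)
        preds.append(c + phi * history[-1])
        history.append(observed)  # the real observation becomes available
    return preds


# Synthetic "traffic" with a 24-step cycle (an assumption, not ISP data).
series = [100.0 + 20.0 * math.sin(2 * math.pi * t / 24) for t in range(288)]
train, test = series[:240], series[240:]

standard_mape = mape(test, standard_forecast(train, len(test)))
rolling_preds = rolling_forecast(train, test)
rolling_mape = mape(test, rolling_preds)

print(f"standard MAPE: {standard_mape:.2f}%")
print(f"rolling  MAPE: {rolling_mape:.2f}%")
```

On this synthetic series the rolling forecast has the lower MAPE because each prediction starts from a real observation instead of an accumulated chain of earlier predictions. The size of the gap here says nothing about the figures reported for the ISP data.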