Paper title
Predicting seasonal influenza using supermarket retail records
Paper authors
Paper abstract
The increased availability of epidemiological data, novel digital data streams, and the rise of powerful machine learning approaches have generated a surge of research activity on real-time epidemic forecasting systems. In this paper, we propose the use of a novel data source, namely retail market data, to improve seasonal influenza forecasting. Specifically, we treat supermarket retail data as a proxy signal for influenza by identifying sentinel baskets, i.e., products bought together by a population of selected customers. We develop a nowcasting and forecasting framework that provides estimates of influenza incidence in Italy up to 4 weeks ahead. We use a Support Vector Regression (SVR) model to produce the predictions of seasonal flu incidence. Our predictions outperform both a baseline autoregressive model and a second baseline based on product purchases. The results show quantitatively the value of incorporating retail market data in forecasting models, where it acts as a proxy that can be used for the real-time analysis of epidemics.
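The general shape of such a pipeline can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the proxy series merely stands in for a sentinel-basket purchase signal, and the function name `make_lagged`, the lag count, the train/test split, and the SVR hyperparameters are all assumptions chosen for illustration. It uses scikit-learn's `SVR` to predict incidence 1 to 4 weeks ahead from recent incidence values plus the current proxy value.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def make_lagged(incidence, proxy, n_lags, horizon):
    """Build a supervised dataset (hypothetical helper).

    Each row of X holds the last `n_lags` incidence values up to week t
    plus the proxy value at week t; y is the incidence at week t + horizon.
    """
    X, y = [], []
    for t in range(n_lags - 1, len(incidence) - horizon):
        X.append(np.r_[incidence[t - n_lags + 1 : t + 1], proxy[t]])
        y.append(incidence[t + horizon])
    return np.array(X), np.array(y)


# Synthetic weekly series (NOT real surveillance or retail data).
rng = np.random.default_rng(0)
weeks = np.arange(156)
incidence = 5 + 4 * np.sin(2 * np.pi * weeks / 52) + rng.normal(0, 0.3, weeks.size)
# Assumed proxy: a noisy signal correlated with incidence.
proxy = 2 * incidence + rng.normal(0, 0.5, weeks.size)

n_train = 100  # assumed chronological split: first rows train, the rest test
mae = {}
for horizon in range(1, 5):  # forecasts 1 to 4 weeks ahead
    X, y = make_lagged(incidence, proxy, n_lags=3, horizon=horizon)
    # Standardize features before the RBF-kernel SVR; hyperparameters are untuned.
    model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
    model.fit(X[:n_train], y[:n_train])
    pred = model.predict(X[n_train:])
    mae[horizon] = float(np.mean(np.abs(pred - y[n_train:])))

for horizon, err in mae.items():
    print(f"horizon {horizon} week(s): MAE = {err:.3f}")
```

Training one model per horizon keeps each forecast a direct regression rather than a recursive one. Dropping the proxy column from `X` gives a purely autoregressive SVR to compare against, in the spirit of the baseline comparison described in the abstract.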