Paper Title

Curating a COVID-19 data repository and forecasting county-level death counts in the United States

Authors

Altieri, Nick, Barter, Rebecca L., Duncan, James, Dwivedi, Raaz, Kumbier, Karl, Li, Xiao, Netzorg, Robert, Park, Briton, Singh, Chandan, Tan, Yan Shuo, Tang, Tiffany, Wang, Yu, Zhang, Chao, Yu, Bin

Abstract

As the COVID-19 outbreak evolves, accurate forecasting continues to play an extremely important role in informing policy decisions. In this paper, we present our continuous curation of a large data repository containing COVID-19 information from a range of sources. We use this data to develop predictions and corresponding prediction intervals for the short-term trajectory of COVID-19 cumulative death counts at the county level in the United States up to two weeks ahead. Using data from January 22 to June 20, 2020, we develop and combine multiple forecasts using ensembling techniques, resulting in an ensemble we refer to as Combined Linear and Exponential Predictors (CLEP). Our individual predictors include county-specific exponential and linear predictors, a shared exponential predictor that pools data together across counties, an expanded shared exponential predictor that uses data from neighboring counties, and a demographics-based shared exponential predictor. We use prediction errors from the past five days to assess the uncertainty of our death predictions, resulting in generally applicable prediction intervals, Maximum (absolute) Error Prediction Intervals (MEPI). MEPI achieves a coverage rate of more than 94% when averaged across counties for predicting cumulative recorded death counts two weeks in the future. Our forecasts are currently being used by the non-profit organization Response4Life to determine the medical supply need for individual hospitals and have directly contributed to the distribution of medical supplies across the country. We hope that our forecasts and data repository at https://covidseverity.com can help guide necessary county-specific decision-making and help counties prepare for their continued fight against COVID-19.
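
To make the MEPI idea in the abstract concrete, the following is a minimal sketch of an interval built from the largest recent prediction error. The abstract only states that errors from the past five days are used and that the interval is based on the maximum absolute error; the function name `mepi_interval`, the normalization of each error by the predicted value, the guard against division by zero, and the clipping of the lower bound at zero are all assumptions made for illustration, not necessarily the authors' exact procedure.

```python
def mepi_interval(past_actual, past_predicted, new_prediction, window=5):
    """Sketch of a maximum-absolute-error prediction interval.

    past_actual / past_predicted: recent observed and predicted cumulative
    death counts for one county, oldest first (hypothetical inputs).
    new_prediction: the point forecast to wrap in an interval.
    window: number of most recent days to inspect (five, per the abstract).
    """
    if len(past_actual) != len(past_predicted):
        raise ValueError("past_actual and past_predicted must have equal length")
    recent = list(zip(past_actual, past_predicted))[-window:]
    if not recent:
        raise ValueError("at least one past prediction is required")

    # Assumption: errors are normalized by the predicted value so that they
    # scale with the size of the county's counts; max(..., 1.0) avoids
    # dividing by zero for counties with no predicted deaths.
    max_error = max(abs(y - p) / max(abs(p), 1.0) for y, p in recent)

    # Assumption: the interval is symmetric in relative terms and the lower
    # bound is clipped at zero because death counts cannot be negative.
    lower = max(new_prediction * (1.0 - max_error), 0.0)
    upper = new_prediction * (1.0 + max_error)
    return lower, upper
```

For example, if the worst relative miss over the last five days was 25%, a point forecast of 200 cumulative deaths would be reported with the interval (150, 250) under these assumptions.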
