Paper Title
STAN: Spatio-Temporal Attention Network for Pandemic Prediction Using Real World Evidence
Authors
Abstract
Objective: The COVID-19 pandemic has created many challenges that need immediate attention. Various epidemiological and deep learning models have been developed to predict the COVID-19 outbreak, but all have limitations that affect the accuracy and robustness of their predictions. Our method aims to address these limitations and to make earlier and more accurate pandemic outbreak predictions by (1) using patients' EHR data from different counties and states, which encode local disease status and medical resource utilization conditions; (2) considering demographic similarity and geographical proximity between locations; and (3) integrating pandemic transmission dynamics into deep learning models. Materials and Methods: We propose a spatio-temporal attention network (STAN) for pandemic prediction. It uses an attention-based graph convolutional network to capture geographical and temporal trends and to predict the number of cases for a fixed number of days into the future. We also design a physical-law-based loss term to enhance long-term prediction. STAN was tested using both massive real-world patient data and open-source COVID-19 statistics provided by Johns Hopkins University across all U.S. counties. Results: STAN outperforms epidemiological modeling methods such as SIR and SEIR, as well as deep learning models, on both long-term and short-term predictions, achieving up to 87% lower mean squared error than the best baseline prediction model. Conclusions: By using information from real-world patient data and geographical data, STAN can better capture disease status and medical resource utilization information and thus provide more accurate pandemic modeling. With regularization based on pandemic transmission laws, STAN also achieves good long-term prediction performance.
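The abstract does not give the form of the physical-law-based loss term, so the following is only an illustrative sketch of how a transmission-dynamics penalty could be combined with a data-fitting loss. It assumes a discrete-time SIR model and a simple weighted sum of two squared-error terms. The function names (`sir_rollout`, `dynamics_regularized_loss`), the `weight` parameter, and the example rates are hypothetical and are not taken from the paper.

```python
import numpy as np

def sir_rollout(beta, gamma, s0, i0, r0, days):
    """Roll a discrete-time SIR model forward for `days` daily steps.

    beta is the transmission rate and gamma the recovery rate
    (illustrative assumptions, not values from the paper).
    """
    n = s0 + i0 + r0  # total population stays constant in SIR
    s, i, r = float(s0), float(i0), float(r0)
    infected, recovered = [], []
    for _ in range(days):
        new_inf = beta * s * i / n   # newly infected today
        new_rec = gamma * i          # newly recovered today
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        infected.append(i)
        recovered.append(r)
    return np.array(infected), np.array(recovered)

def dynamics_regularized_loss(pred_i, true_i, beta, gamma, s0, i0, r0, weight=0.5):
    """Data-fitting MSE plus a penalty for deviating from the SIR trajectory.

    `weight` balances the two terms; it is a hypothetical hyperparameter.
    """
    pred_i = np.asarray(pred_i, dtype=float)
    true_i = np.asarray(true_i, dtype=float)
    sir_i, _ = sir_rollout(beta, gamma, s0, i0, r0, len(pred_i))
    data_term = np.mean((pred_i - true_i) ** 2)      # fit to observed cases
    dynamics_term = np.mean((pred_i - sir_i) ** 2)   # agreement with SIR dynamics
    return data_term + weight * dynamics_term

# Example: a 5-day forecast for a population of 1000 with 10 initial infections.
sir_infected, _ = sir_rollout(0.3, 0.1, 990, 10, 0, 5)
loss = dynamics_regularized_loss(sir_infected + 1.0, sir_infected, 0.3, 0.1, 990, 10, 0)
```

In a sketch like this, the dynamics term discourages forecasts that drift away from an epidemiologically plausible trajectory, which is one way such a penalty might help when the prediction horizon is long.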