Paper title
Finding an Accurate Early Forecasting Model from Small Dataset: A Case of 2019-nCoV Novel Coronavirus Outbreak
Paper authors
Paper abstract
An epidemic is the rapid and wide spread of an infectious disease, threatening many lives and causing economic damage. It is important to foretell the lifetime of an epidemic so that timely remedial actions can be decided. These measures include closing borders and schools and suspending community services and commuting. Lifting such restrictions depends on the momentum of the outbreak and its rate of decay. Being able to accurately forecast the fate of an epidemic is an extremely important but difficult task. Due to limited knowledge of the novel disease, the high uncertainty involved, and the complex societal and political factors that influence the spread of the new virus, any forecast is anything but reliable. Another factor is the insufficient amount of available data: data samples are often scarce when an epidemic has just started. With only a few training samples on hand, finding a forecasting model that offers the best possible forecast is a big challenge in machine learning. In the past, three popular methods have been proposed: 1) augmenting the existing small dataset, 2) using panel selection to pick the best forecasting model from several models, and 3) fine-tuning the parameters of an individual forecasting model for the highest possible accuracy. In this paper, a methodology that embraces these three virtues of data mining from a small dataset is proposed...