Paper Title

Online Bayesian phylodynamic inference in BEAST with application to epidemic reconstruction

Paper Authors

Gill, Mandev S., Lemey, Philippe, Suchard, Marc A., Rambaut, Andrew, Baele, Guy

Abstract

Reconstructing pathogen dynamics from genetic data as they become available during an outbreak or epidemic represents an important statistical scenario in which observations arrive sequentially in time and one is interested in performing inference in an 'online' fashion. Widely-used Bayesian phylogenetic inference packages are not set up for this purpose, generally requiring one to recompute trees and evolutionary model parameters de novo when new data arrive. To accommodate increasing data flow in a Bayesian phylogenetic framework, we introduce a methodology to efficiently update the posterior distribution with newly available genetic data. Our procedure is implemented in the BEAST 1.10 software package, and relies on a distance-based measure to insert new taxa into the current estimate of the phylogeny and imputes plausible values for new model parameters to accommodate growing dimensionality. This augmentation creates informed starting values and re-uses optimally tuned transition kernels for posterior exploration of growing data sets, reducing the time necessary to converge to target posterior distributions. We apply our framework to data from the recent West African Ebola virus epidemic and demonstrate a considerable reduction in time required to obtain posterior estimates at different time points of the outbreak. Beyond epidemic monitoring, this framework easily finds other applications within the phylogenetics community, where changes in the data -- in terms of alignment changes, sequence addition or removal -- present common scenarios that can benefit from online inference.
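The abstract describes inserting new taxa into the current tree estimate using a distance-based measure. The following is a minimal sketch of that general idea only, not the BEAST 1.10 implementation: it assumes aligned sequences of equal length, uses a simple proportion-of-differing-sites distance, and attaches the new taxon as a sibling of its closest existing taxon in a topology stored as nested tuples. The names `p_distance`, `closest_taxon`, and `insert_taxon`, as well as the toy sequences, are hypothetical. Branch lengths, sampling dates, and the imputation of new model parameters mentioned in the abstract are deliberately left out.

```python
from typing import Dict, Tuple, Union

# A topology is either a leaf name or a pair of subtrees (no branch lengths).
Tree = Union[str, Tuple["Tree", "Tree"]]


def p_distance(a: str, b: str) -> float:
    """Proportion of aligned sites that differ, skipping gaps and 'N'."""
    pairs = [(x, y) for x, y in zip(a, b) if x not in "-N" and y not in "-N"]
    if not pairs:
        return float("inf")  # no comparable sites
    return sum(x != y for x, y in pairs) / len(pairs)


def closest_taxon(new_seq: str, alignment: Dict[str, str]) -> str:
    """Name of the existing taxon with the smallest distance to new_seq."""
    return min(alignment, key=lambda name: p_distance(new_seq, alignment[name]))


def insert_taxon(tree: Tree, target: str, new_name: str) -> Tree:
    """Return a copy of tree in which leaf `target` becomes (target, new_name)."""
    if isinstance(tree, str):
        return (tree, new_name) if tree == target else tree
    left, right = tree
    return (
        insert_taxon(left, target, new_name),
        insert_taxon(right, target, new_name),
    )


# Toy data (hypothetical): three taxa already in the tree, one new arrival.
alignment = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "TTGTACCA"}
tree: Tree = (("A", "B"), "C")
new_name, new_seq = "D", "TTGTACCT"

nearest = closest_taxon(new_seq, alignment)
updated = insert_taxon(tree, nearest, new_name)
print(nearest, updated)
```

In an online setting, a tree augmented in this way would serve only as an informed starting point for further posterior sampling, not as a final estimate.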
