Title

Unsupervised Probabilistic Models for Sequential Electronic Health Records

Authors

Kaplan, Alan D., Greene, John D., Liu, Vincent X., Ray, Priyadip

Abstract

We develop an unsupervised probabilistic model for heterogeneous Electronic Health Record (EHR) data. Utilizing a mixture model formulation, our approach directly models sequences of arbitrary length, such as medications and laboratory results. This allows for subgrouping and incorporation of the dynamics underlying heterogeneous data types. The model consists of a layered set of latent variables that encode underlying structure in the data. These variables represent subject subgroups at the top layer, and unobserved states for sequences in the second layer. We train this model on episodic data from subjects receiving medical care in the Kaiser Permanente Northern California integrated healthcare delivery system. The resulting properties of the trained model generate novel insight from these complex and multifaceted data. In addition, we show how the model can be used to analyze sequences that contribute to assessment of mortality likelihood.
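The two-layer structure described in the abstract (subject subgroups on top, unobserved per-sequence states underneath) can be illustrated with a small mixture of hidden Markov models. This is a sketch under stated assumptions, not the authors' implementation: the abstract does not specify the form of the second layer, so the code assumes discrete-symbol HMMs, one per sequence type, with sequence types conditionally independent given the subgroup. All names (`forward_loglik`, `subgroup_posterior`, the `"labs"` key) and all parameter values are hypothetical.

```python
import math


def forward_loglik(seq, init, trans, emit):
    """Log-likelihood of a discrete sequence under one HMM (scaled forward pass).

    seq:   list of symbol indices, any length
    init:  init[s] = P(first hidden state = s)
    trans: trans[r][s] = P(next state = s | current state = r)
    emit:  emit[s][x] = P(symbol = x | hidden state = s)
    """
    if not seq:
        return 0.0  # an empty sequence contributes no evidence
    n = len(init)
    alpha = [init[s] * emit[s][seq[0]] for s in range(n)]
    loglik = 0.0
    for t in range(len(seq)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][seq[t]]
                for s in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return loglik


def subgroup_posterior(subject, weights, hmms):
    """Posterior over subgroups (top-layer latent variable) for one subject.

    subject: dict mapping sequence type -> list of symbol indices
    weights: prior subgroup probabilities (mixture weights)
    hmms:    one dict per subgroup, mapping sequence type -> (init, trans, emit)
    """
    log_joint = []
    for w, group in zip(weights, hmms):
        lp = math.log(w)
        for name, seq in subject.items():
            lp += forward_loglik(seq, *group[name])
        log_joint.append(lp)
    m = max(log_joint)
    unnorm = [math.exp(lp - m) for lp in log_joint]  # log-sum-exp trick
    z = sum(unnorm)
    return [u / z for u in unnorm]


# Hypothetical toy parameters: 2 subgroups, 2 hidden states,
# 2 symbols (0 = normal lab result, 1 = abnormal lab result).
emit = [[0.9, 0.1], [0.2, 0.8]]
hmms = [
    {"labs": ([0.9, 0.1], [[0.9, 0.1], [0.5, 0.5]], emit)},  # mostly normal
    {"labs": ([0.2, 0.8], [[0.5, 0.5], [0.1, 0.9]], emit)},  # mostly abnormal
]
weights = [0.5, 0.5]

subject = {"labs": [1, 1, 1, 1]}  # four abnormal results in a row
posterior = subgroup_posterior(subject, weights, hmms)
print(posterior)
```

Because the forward pass handles sequences of any length, subjects with different numbers of observations can be scored by the same model. Fitting the parameters (for example with expectation-maximization) is outside the scope of this sketch.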
