Paper Title

Multi-layer Trajectory Clustering: A Network Algorithm for Disease Subtyping

Author

Krishnagopal, Sanjukta

Abstract

Many diseases display heterogeneity in clinical features and their progression, indicative of the existence of disease subtypes. Extracting patterns of disease variable progression for subtypes has tremendous application in medicine, for example, in early prognosis and personalized medical therapy. This work presents a novel, data-driven, network-based Trajectory Clustering (TC) algorithm for identifying Parkinson's subtypes based on disease trajectory. Modeling patient-variable interactions as a bipartite network, TC first extracts communities of co-expressing disease variables at different stages of progression. Then, it identifies Parkinson's subtypes by clustering similar patient trajectories, which are characterized by the severity of disease variables, through a multi-layer network. Determination of trajectory similarity accounts for direct overlaps between trajectories as well as second-order similarities, i.e., common overlap with a third set of trajectories. This work clusters trajectories across two types of layers: (a) temporal, and (b) ranges of an independent outcome variable (representative of disease severity), both of which yield four distinct subtypes. The former subtypes exhibit differences in the progression of disease domains (Cognitive, Mental Health, etc.), whereas the latter subtypes exhibit different degrees of progression, i.e., some remain mild, whereas others show significant deterioration after 5 years. The TC approach is validated through statistical analyses and the consistency of the identified subtypes with the medical literature. This generalizable and robust method can easily be extended to other progressive multivariate disease datasets, and can effectively assist in targeted subtype-specific treatment in the field of personalized medicine.
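
The abstract does not give the exact similarity formula, so the following is only an illustrative sketch of the idea of combining direct overlap with second-order similarity. It assumes each patient trajectory is a sequence of community labels, one per layer; the function names `direct_overlap` and `combined_similarity`, the weight `alpha`, and the toy data are all hypothetical and are not taken from the paper.

```python
import numpy as np

def direct_overlap(trajectories):
    """Count, for every pair of patients, the layers in which both
    fall in the same community (assumed encoding: one label per layer)."""
    t = np.asarray(trajectories)
    same = t[:, None, :] == t[None, :, :]      # shape (n, n, n_layers)
    overlap = same.sum(axis=2).astype(float)
    np.fill_diagonal(overlap, 0.0)             # ignore self-overlap
    return overlap

def combined_similarity(trajectories, alpha=0.5):
    """Direct overlap plus a weighted second-order term.
    The second-order term (first @ first) is large when two patients
    both overlap with the same third patients. alpha is an assumed weight."""
    first = direct_overlap(trajectories)
    second = first @ first
    np.fill_diagonal(second, 0.0)
    # Scale each term to [0, 1] so that alpha is comparable across datasets.
    if first.max() > 0:
        first = first / first.max()
    if second.max() > 0:
        second = second / second.max()
    return first + alpha * second

# Toy data: 4 patients, 3 layers (e.g. three time points).
trajectories = np.array([
    [0, 0, 1],
    [0, 0, 1],
    [0, 1, 1],
    [2, 2, 0],
])
sim = combined_similarity(trajectories)
```

In this toy example, patients 0 and 1 share a community in every layer and so receive the highest similarity, while patient 3 shares none and has zero similarity to everyone. A clustering step (for instance, community detection on the graph weighted by `sim`) would then group similar trajectories into candidate subtypes.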
