Model-Based Longitudinal Clustering with Varying Cluster Assignments

Title

Model-Based Longitudinal Clustering with Varying Cluster Assignments

Authors

Daniel K. Sewell, Yuguo Chen, William Bernhard, Tracy Sulkin

Abstract

It is often of interest to perform clustering on longitudinal data, yet it is difficult to formulate an intuitive model for which estimation is computationally feasible. We propose a model-based clustering method for clustering objects that are observed over time. The proposed model can be viewed as an extension of the normal mixture model for clustering to longitudinal data. While existing models only account for clustering effects, we propose modeling the distribution of the observed values of each object as a blending of a cluster effect and an individual effect, hence also giving an estimate of how much the behavior of an object is determined by the cluster to which it belongs. Further, it is important to detect how explanatory variables affect the clustering. An advantage of our method is that it can handle multiple explanatory variables of any type through a linear modeling of the cluster transition probabilities. We implement the generalized EM algorithm using several recursive relationships to greatly decrease the computational cost. The accuracy of our estimation method is illustrated in a simulation study, and U.S. Congressional data is analyzed.
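The abstract names three ingredients: a normal mixture, a blend of a cluster effect with an individual effect, and cluster transition probabilities modeled linearly in explanatory variables. The sketch below simulates one object under one possible reading of that description. It is an illustration only, not the authors' specification: the blending form `lam * cluster_mean + (1 - lam) * previous_value`, the multinomial-logit link, the scalar observations, and every name (`transition_probs`, `simulate_object`, `lam`, `coefs`) are assumptions made here.

```python
import math
import random


def softmax(scores):
    """Turn arbitrary real scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def transition_probs(prev_cluster, x, coefs):
    """Hypothetical multinomial-logit transition model (an assumption).

    The score for moving to cluster k is linear in the covariates x:
    intercept[prev_cluster][k] + beta[k] . x
    """
    n_clusters = len(coefs["beta"])
    scores = [
        coefs["intercept"][prev_cluster][k]
        + sum(b * xi for b, xi in zip(coefs["beta"][k], x))
        for k in range(n_clusters)
    ]
    return softmax(scores)


def simulate_object(n_times, x, cluster_means, lam, sigma, coefs, rng):
    """Simulate one object's cluster path and scalar observations.

    Assumed blending: the mean at time t is
    lam * (current cluster mean) + (1 - lam) * (object's previous value),
    so lam near 1 means behavior is driven mostly by the cluster.
    """
    n_clusters = len(cluster_means)
    z = rng.randrange(n_clusters)            # initial cluster, uniform
    y = rng.gauss(cluster_means[z], sigma)   # initial observation
    clusters, values = [z], [y]
    for _ in range(1, n_times):
        probs = transition_probs(z, x, coefs)
        z = rng.choices(range(n_clusters), weights=probs)[0]
        mean = lam * cluster_means[z] + (1 - lam) * y
        y = rng.gauss(mean, sigma)
        clusters.append(z)
        values.append(y)
    return clusters, values


if __name__ == "__main__":
    rng = random.Random(0)
    coefs = {
        "intercept": [[2.0, 0.0], [0.0, 2.0]],  # favors staying put
        "beta": [[0.5], [-0.5]],                # one covariate
    }
    path, obs = simulate_object(6, [1.0], [-3.0, 3.0], 0.7, 1.0, coefs, rng)
    print(path)
    print([round(v, 2) for v in obs])
```

Under these assumptions, `lam` plays the role the abstract describes as "how much the behavior of an object is determined by the cluster to which it belongs". Fitting such a model, which the abstract says is done with a generalized EM algorithm and recursive relationships, is not shown here.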
