Paper title

State space model multiple imputation for missing data in non-stationary multivariate time series with application in digital Psychiatry

Authors

Cai, Xiaoxuan; Wang, Xinru; Zeng, Li; Eichi, Habiballah Rahimi; Ongur, Dost; Dixon, Lisa; Baker, Justin T.; Onnela, Jukka-Pekka; Valeri, Linda

Abstract

Mobile technology enables unprecedented continuous monitoring of an individual's behavior, social interactions, symptoms, and other health conditions, presenting an enormous opportunity for therapeutic advancements and scientific discoveries regarding the etiology of psychiatric illness. Continuous collection of mobile data generates a new type of data: entangled multivariate time series of outcome, exposure, and covariates. Missing data is a pervasive problem in biomedical and social science research, and Ecological Momentary Assessment (EMA) using mobile devices in psychiatric research is no exception. However, the complex structure of multivariate time series introduces new challenges in handling missing data for proper causal inference. Data imputation is commonly recommended to enhance data utility and estimation efficiency. The majority of available imputation methods are designed either for longitudinal data with limited follow-up times or for stationary time series, and both are incompatible with potentially non-stationary time series. In the field of psychiatry, non-stationary data are frequently encountered, as symptoms and treatment regimens may change dramatically over time. To address missing data in possibly non-stationary multivariate time series, we propose a novel multiple imputation strategy based on the state space model (SSMmp) and a more computationally efficient variant (SSMimpute). We demonstrate their advantages over other widely used missing data strategies by evaluating their theoretical properties and empirical performance in simulations of both stationary and non-stationary time series, subject to various missingness mechanisms. We apply SSMimpute to investigate the association between social network size and negative mood using a multi-year observational smartphone study of bipolar patients, controlling for confounding variables.
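To make the general idea of state-space-based multiple imputation concrete, the following is a minimal sketch and not the authors' SSMmp or SSMimpute procedure. It assumes a univariate local-level model with known noise standard deviations (a proper multiple imputation would also account for parameter uncertainty), and the function name `ffbs_impute` and all its parameters are hypothetical. The Kalman filter skips the update step at missing time points, backward sampling draws a latent state path, and each missing value is imputed as the sampled state plus observation noise.

```python
import numpy as np

def ffbs_impute(y, sigma_obs, sigma_state, m0=0.0, c0=1e6, n_draws=5, seed=0):
    """Hypothetical helper: multiple imputation under a local-level model.

    Assumed model (not taken from the paper):
        y_t  = mu_t + eps_t,      eps_t ~ N(0, sigma_obs^2)
        mu_t = mu_{t-1} + eta_t,  eta_t ~ N(0, sigma_state^2)
    Missing observations are encoded as NaN.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n = len(y)
    miss = np.isnan(y)

    # Forward pass: Kalman filter; at missing points only the prediction is kept.
    m = np.empty(n)  # filtered state means
    c = np.empty(n)  # filtered state variances
    m_prev, c_prev = m0, c0
    for t in range(n):
        a, r = m_prev, c_prev + sigma_state**2   # one-step-ahead prediction
        if miss[t]:
            m[t], c[t] = a, r                    # no observation, no update
        else:
            k = r / (r + sigma_obs**2)           # Kalman gain
            m[t] = a + k * (y[t] - a)
            c[t] = (1.0 - k) * r
        m_prev, c_prev = m[t], c[t]

    # Backward pass: sample state paths, then impute missing observations.
    draws = np.tile(y, (n_draws, 1))
    for d in range(n_draws):
        mu = np.empty(n)
        mu[-1] = rng.normal(m[-1], np.sqrt(c[-1]))
        for t in range(n - 2, -1, -1):
            b = c[t] / (c[t] + sigma_state**2)   # backward smoothing weight
            mean = m[t] + b * (mu[t + 1] - m[t])
            var = c[t] * (1.0 - b)
            mu[t] = rng.normal(mean, np.sqrt(var))
        noise = rng.normal(0.0, sigma_obs, size=n)
        draws[d, miss] = mu[miss] + noise[miss]  # observed entries stay untouched
    return draws

# Illustrative usage with made-up numbers.
y = np.array([1.0, 1.2, np.nan, np.nan, 2.1, 2.0])
imputed = ffbs_impute(y, sigma_obs=0.3, sigma_state=0.2, n_draws=5)
```

Each row of `imputed` is one completed series. In a multiple imputation workflow the downstream analysis would be run on every row and the estimates pooled, for example with Rubin's rules.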
