Paper title

Online detection of local abrupt changes in high-dimensional Gaussian graphical models

Authors

Hossein Keshavarz, George Michailidis

Abstract

The problem of identifying change points in high-dimensional Gaussian graphical models (GGMs) in an online fashion is of interest due to new applications in biology, economics, and the social sciences. The offline version of the problem, where all the data are available a priori, has led to a number of methods and associated algorithms involving regularized loss functions. For the online version, however, there is currently only a single work in the literature that develops a sequential testing procedure and studies its asymptotic false alarm probability and power. That test is best suited to detecting change points driven by global changes in the structure of the precision matrix of the GGM, in the sense that many edges are involved. In many practical settings, however, the change point is driven by local changes, in the sense that only a small number of edges change. To that end, we develop a novel test that is based on the $\ell_\infty$ norm of the normalized covariance matrix of an appropriately selected portion of the incoming data. Studying the asymptotic distribution of the proposed test statistic under the null hypothesis (no change point) and the alternative hypothesis (a change point is present) requires new technical tools for examining maxima of graph-dependent Gaussian random variables, which are of independent interest. It is further shown that these tools lead to mild regularity conditions on key model parameters, rather than the more stringent conditions required by tools previously used for related problems in the literature. Numerical work on synthetic data illustrates the good performance of the proposed detection procedure, in terms of both computational and statistical efficiency, across numerous experimental settings.
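To make the idea of an $\ell_\infty$-type statistic concrete, the following is a minimal Python sketch and not the authors' procedure. It assumes zero-mean data, a known pre-change precision matrix (the hypothetical argument `omega0`), and a fixed window of recent samples; the function name `linf_statistic` and the particular normalization are illustrative choices. The window is whitened with `omega0`, so that under the null its sample covariance is close to the identity, and the statistic is the largest standardized entrywise deviation from the identity.

```python
import numpy as np

def linf_statistic(window, omega0):
    """Illustrative max-norm statistic (an assumption, not the paper's exact test).

    window : (n, p) array of zero-mean observations from the most recent window.
    omega0 : (p, p) pre-change precision matrix, assumed known here.
    """
    n, p = window.shape
    # omega0 = L @ L.T, so rows of window @ L have identity covariance under the null.
    L = np.linalg.cholesky(omega0)
    y = window @ L
    s = y.T @ y / n                      # sample covariance of the whitened window
    dev = s - np.eye(p)                  # deviation from the null covariance
    # Under the null, sqrt(n) * dev has variance 1 off the diagonal and 2 on it.
    scale = np.ones((p, p))
    np.fill_diagonal(scale, np.sqrt(2.0))
    return np.sqrt(n) * np.max(np.abs(dev) / scale)

# Synthetic check: a change in a single edge should inflate the statistic.
rng = np.random.default_rng(0)
p, n = 10, 2000
omega0 = np.eye(p)
omega1 = omega0.copy()
omega1[0, 1] = omega1[1, 0] = 0.4        # local change: one edge appears

x_null = rng.multivariate_normal(np.zeros(p), np.linalg.inv(omega0), size=n)
x_alt = rng.multivariate_normal(np.zeros(p), np.linalg.inv(omega1), size=n)

stat_null = linf_statistic(x_null, omega0)
stat_alt = linf_statistic(x_alt, omega0)
print(stat_null, stat_alt)
```

Because the statistic takes a maximum over entries rather than a sum, a change confined to one edge is not diluted by the many unchanged entries, which is the intuition behind targeting local changes. An actual sequential test would also need a window-selection rule and a threshold calibrated to the desired false alarm probability, neither of which is specified here.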
