Assessing the Significance of Directed and Multivariate Measures of Linear Dependence Between Time Series

Paper title

Assessing the Significance of Directed and Multivariate Measures of Linear Dependence Between Time Series

Authors

Cliff, Oliver M.; Novelli, Leonardo; Fulcher, Ben D.; Shine, James M.; Lizier, Joseph T.

Abstract

Inferring linear dependence between time series is central to our understanding of natural and artificial systems. Unfortunately, the hypothesis tests that are used to determine statistically significant directed or multivariate relationships from time-series data often yield spurious associations (Type I errors) or omit causal relationships (Type II errors). This is due to the autocorrelation present in the analysed time series -- a property that is ubiquitous across diverse applications, from brain dynamics to climate change. Here we show that, for limited data, this issue cannot be mediated by fitting a time-series model alone (e.g., in Granger causality or prewhitening approaches), and instead that the degrees of freedom in statistical tests should be altered to account for the effective sample size induced by cross-correlations in the observations. This insight enabled us to derive modified hypothesis tests for any multivariate correlation-based measures of linear dependence between covariance-stationary time series, including Granger causality and mutual information with Gaussian marginals. We use both numerical simulations (generated by autoregressive models and digital filtering) as well as recorded fMRI-neuroimaging data to show that our tests are unbiased for a variety of stationary time series. Our experiments demonstrate that the commonly used $F$- and $\chi^2$-tests can induce significant false-positive rates of up to $100\%$ for both measures, with and without prewhitening of the signals. These findings suggest that many dependencies reported in the scientific literature may have been, and may continue to be, spuriously reported or missed if modified hypothesis tests are not used when analysing time series.
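
The idea of altering the degrees of freedom to reflect an effective sample size can be illustrated in the simplest case, the Pearson correlation between two autocorrelated series. The sketch below is not the authors' procedure for Granger causality or mutual information; it assumes the classical Bartlett approximation, in which the sample size N is divided by the sum over lags of the product of the two autocorrelation functions. The function names (`simulate_ar1`, `autocorr`, `effective_sample_size`, `corrected_t_statistic`) and the truncation at N/5 lags are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def simulate_ar1(phi, n, rng):
    """Simulate a zero-mean AR(1) process x[t] = phi * x[t-1] + noise[t]."""
    x = np.zeros(n)
    noise = rng.standard_normal(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + noise[t]
    return x


def autocorr(x, max_lag):
    """Biased sample autocorrelation at lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[: n - k], x[k:]) / denom for k in range(max_lag + 1)])


def effective_sample_size(x, y, max_lag=None):
    """Bartlett-style effective sample size for the correlation of x and y.

    N_eff = N / sum_k r_xx(k) * r_yy(k), summed over lags -max_lag..max_lag.
    """
    n = len(x)
    if max_lag is None:
        max_lag = n // 5  # assumed truncation point, chosen for illustration
    rx = autocorr(x, max_lag)
    ry = autocorr(y, max_lag)
    # Autocorrelation is symmetric in the lag, so positive lags count twice.
    inflation = 1.0 + 2.0 * np.sum(rx[1:] * ry[1:])
    inflation = max(inflation, 1e-12)  # guard against a non-positive estimate
    return n / inflation


def corrected_t_statistic(x, y, max_lag=None):
    """Pearson r with a t statistic based on N_eff - 2 degrees of freedom."""
    r = np.corrcoef(x, y)[0, 1]
    n_eff = effective_sample_size(x, y, max_lag)
    dof = max(n_eff - 2.0, 1.0)  # keep the degrees of freedom positive
    t = r * np.sqrt(dof / (1.0 - r**2))
    return r, n_eff, t


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two independent but strongly autocorrelated series.
    x = simulate_ar1(0.9, 2000, rng)
    y = simulate_ar1(0.9, 2000, rng)
    r, n_eff, t = corrected_t_statistic(x, y)
    naive_t = r * np.sqrt((len(x) - 2) / (1.0 - r**2))
    print(f"r = {r:.3f}, N = {len(x)}, N_eff = {n_eff:.1f}")
    print(f"naive t = {naive_t:.2f}, corrected t = {t:.2f}")
```

For two independent AR(1) series with strong autocorrelation, the effective sample size is much smaller than the number of observations, so the naive statistic overstates significance; this is the false-positive mechanism the abstract describes.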
