Paper title
High-recall causal discovery for autocorrelated time series with latent confounders
Paper authors
Paper abstract
We present a new method for linear and nonlinear, lagged and contemporaneous constraint-based causal discovery from observational time series in the presence of latent confounders. We show that existing causal discovery methods such as FCI and variants suffer from low recall in the autocorrelated time series case and identify low effect size of conditional independence tests as the main reason. Information-theoretical arguments show that effect size can often be increased if causal parents are included in the conditioning sets. To identify parents early on, we suggest an iterative procedure that utilizes novel orientation rules to determine ancestral relationships already during the edge removal phase. We prove that the method is order-independent, and sound and complete in the oracle case. Extensive simulation studies for different numbers of variables, time lags, sample sizes, and further cases demonstrate that our method indeed achieves much higher recall than existing methods for the case of autocorrelated continuous variables while keeping false positives at the desired level. This performance gain grows with stronger autocorrelation. At https://github.com/jakobrunge/tigramite we provide Python code for all methods involved in the simulation studies.
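The effect-size argument in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' method or code. It assumes a hypothetical linear Gaussian model in which X is white noise and Y_t = a * Y_{t-1} + c * X_{t-1} + noise, and it uses partial correlation as a stand-in for the conditional independence statistic. The function names, coefficients, and sample size are all illustrative choices. It compares the plain correlation between X_{t-1} and Y_t with the partial correlation after conditioning on Y_{t-1}, the autocorrelation parent of Y_t.

```python
import numpy as np


def simulate(n, a, c, seed=0):
    """Toy model (an assumption, not from the paper):
    X is white noise, Y_t = a * Y_{t-1} + c * X_{t-1} + noise_t."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    noise = rng.standard_normal(n)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = a * y[t - 1] + c * x[t - 1] + noise[t]
    return x, y


def residual(target, conditions):
    """Residual of `target` after least-squares regression on `conditions`."""
    design = np.column_stack([np.ones(len(target))] + list(conditions))
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef


def effect_sizes(n=5000, a=0.9, c=0.3, seed=0, burn_in=200):
    """Return (plain correlation, partial correlation given Y_{t-1})
    for the link X_{t-1} -> Y_t."""
    x, y = simulate(n + burn_in, a, c, seed)
    x, y = x[burn_in:], y[burn_in:]  # discard the transient
    x_lag, y_lag, y_now = x[:-1], y[:-1], y[1:]
    plain = np.corrcoef(x_lag, y_now)[0, 1]
    partial = np.corrcoef(
        residual(x_lag, [y_lag]), residual(y_now, [y_lag])
    )[0, 1]
    return plain, partial


plain, partial = effect_sizes()
print(f"correlation(X_t-1, Y_t)                = {plain:.3f}")
print(f"partial correlation given parent Y_t-1 = {partial:.3f}")
```

In this toy model the population values are c * sqrt(1 - a^2) / sqrt(1 + c^2) for the plain correlation and c / sqrt(1 + c^2) for the partial correlation. Conditioning on the parent therefore removes the factor sqrt(1 - a^2), which shrinks toward zero as the autocorrelation a grows. That is consistent with the abstract's statement that the gain increases with stronger autocorrelation.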