Paper title
Estimating Treatment Effects with Observed Confounders and Mediators
Paper authors
Paper abstract
Given a causal graph, the do-calculus can express treatment effects as functionals of the observational joint distribution that can be estimated empirically. Sometimes the do-calculus identifies multiple valid formulae, prompting us to compare the statistical properties of the corresponding estimators. For example, the backdoor formula applies when all confounders are observed and the frontdoor formula applies when an observed mediator transmits the causal effect. In this paper, we investigate the over-identified scenario where both confounders and mediators are observed, rendering both estimators valid. Addressing the linear Gaussian causal model, we demonstrate that either estimator can dominate the other by an unbounded constant factor. Next, we derive an optimal estimator, which leverages all observed variables, and bound its finite-sample variance. We show that it strictly outperforms the backdoor and frontdoor estimators and that this improvement can be unbounded. We also present a procedure for combining two datasets, one with observed confounders and another with observed mediators. Finally, we evaluate our methods on both simulated data and the IHDP and JTPA datasets.
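To make the over-identified setting concrete, the following is a minimal sketch, not the paper's optimal estimator. It assumes a zero-mean linear Gaussian model with one observed confounder W, treatment X, one observed mediator M, and outcome Y, where W affects X and Y, and X affects Y only through M. The function names and the coefficients a, b, c, d are hypothetical and chosen purely for illustration. Under these assumptions the true effect of X on Y is a * c, and both the backdoor and the frontdoor regressions should recover it in large samples.

```python
import numpy as np


def simulate(n, a=0.8, b=1.0, c=1.5, d=0.7, seed=0):
    """Draw n samples from an assumed linear Gaussian model (illustrative only).

    W -> X, W -> Y (confounding); X -> M -> Y (mediation).
    The true causal effect of X on Y is a * c.
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(size=n)                  # observed confounder
    x = d * w + rng.normal(size=n)          # treatment
    m = c * x + rng.normal(size=n)          # observed mediator
    y = a * m + b * w + rng.normal(size=n)  # outcome
    return w, x, m, y


def ols(target, *regressors):
    """Least-squares coefficients without an intercept (all variables are zero-mean)."""
    design = np.column_stack(regressors)
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def backdoor(w, x, y):
    """Backdoor estimate: coefficient on X when regressing Y on (X, W)."""
    return ols(y, x, w)[0]


def frontdoor(x, m, y):
    """Frontdoor estimate: (effect of X on M) times (effect of M on Y given X)."""
    c_hat = ols(m, x)[0]
    a_hat = ols(y, m, x)[0]
    return c_hat * a_hat


if __name__ == "__main__":
    w, x, m, y = simulate(100_000)
    print("true effect:", 0.8 * 1.5)
    print("backdoor   :", backdoor(w, x, y))
    print("frontdoor  :", frontdoor(x, m, y))
```

Repeating the simulation over many seeds and comparing the empirical variances of the two estimates is one way to see how their relative precision shifts as the noise scales and coefficients change.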