Paper Title

Semiparametric Estimation on Multi-treatment Causal Effects via Cross-Fitting

Author

Zeng, Jingying

Abstract

Causal inference is a critical research area with multidisciplinary origins and applications, ranging from statistics, computer science, economics, and psychology to public health. In many scientific fields, randomized experiments have served for decades as the gold standard for estimating causal effects. However, in many situations randomized experiments are not feasible in practice, so practitioners need to rely on empirical investigation for causal reasoning. Causal inference from observational data is a challenging task because the treatment assignment mechanism is unknown, so untestable assumptions are typically required to make inference possible. For several years, great effort has been devoted to causal inference for binary treatments. In practice, it is also common to use observational data to compare multiple treatments. Within the potential outcomes framework, we propose a generalized cross-fitting (GCF) estimator, which generalizes the doubly robust estimator with cross-fitting for binary treatments to multiple treatment comparisons, and we provide rigorous proofs of its statistical properties. This estimator permits the use of flexible machine learning methods to model the nuisance components and relies on relatively weak assumptions, while still offering a theoretical guarantee of valid statistical inference. We show the asymptotic properties of the GCF estimator and provide asymptotic simultaneous confidence intervals that achieve the semiparametric efficiency bound for the average treatment effect. The performance of the estimator is assessed through a simulation study using evaluation metrics commonly considered in the causal inference literature.
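To make the idea in the abstract more concrete, the sketch below shows a cross-fitted doubly robust (augmented inverse probability weighting) estimator for several treatment arms. It is not the paper's GCF implementation and does not produce simultaneous confidence intervals. The function names, the softmax propensity model, the per-arm linear outcome models, the clipping threshold, and the simulated data are all assumptions chosen for illustration; in practice the nuisance models could be any flexible machine learning method.

```python
import numpy as np

def add_intercept(X):
    return np.column_stack([np.ones(len(X)), X])

def fit_softmax(X, t, n_treat, steps=500, lr=0.5):
    """Illustrative propensity model: multinomial logistic regression
    fitted by plain gradient descent."""
    Xb = add_intercept(X)
    W = np.zeros((Xb.shape[1], n_treat))
    onehot = np.eye(n_treat)[t]
    for _ in range(steps):
        W -= lr * Xb.T @ (softmax(Xb @ W) - onehot) / len(Xb)
    return W

def softmax(logits):
    z = np.exp(logits - logits.max(axis=1, keepdims=True))
    return z / z.sum(axis=1, keepdims=True)

def fit_linear(X, y):
    """Illustrative outcome model: ordinary least squares."""
    coef, *_ = np.linalg.lstsq(add_intercept(X), y, rcond=None)
    return coef

def cross_fit_aipw(X, t, y, n_treat, n_folds=5, seed=0, clip=0.01):
    """Return estimated mean potential outcomes E[Y(a)] for each arm a,
    plus the per-unit doubly robust scores."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = rng.permutation(n) % n_folds  # balanced random fold labels
    psi = np.zeros((n, n_treat))
    for k in range(n_folds):
        train, test = folds != k, folds == k
        # Nuisance models are fitted only on the other folds.
        W = fit_softmax(X[train], t[train], n_treat)
        e = np.clip(softmax(add_intercept(X[test]) @ W), clip, 1.0)
        for a in range(n_treat):
            arm = train & (t == a)
            mu = add_intercept(X[test]) @ fit_linear(X[arm], y[arm])
            ind = (t[test] == a).astype(float)
            # Outcome-model prediction plus inverse-probability-weighted residual.
            psi[test, a] = mu + ind * (y[test] - mu) / e[:, a]
    return psi.mean(axis=0), psi

# Simulated example with three arms; by construction E[Y(a)] = a.
rng = np.random.default_rng(1)
n = 3000
X = rng.normal(size=(n, 2))
p = softmax(np.column_stack([np.zeros(n), 0.5 * X[:, 0], -0.5 * X[:, 1]]))
t = np.array([rng.choice(3, p=row) for row in p])
y = X[:, 0] + X[:, 1] + t + rng.normal(size=n)

means, psi = cross_fit_aipw(X, t, y, n_treat=3)
for a in (1, 2):
    diff = psi[:, a] - psi[:, 0]
    se = diff.std(ddof=1) / np.sqrt(n)
    print(f"arm {a} vs arm 0: estimate {diff.mean():.3f}, std. error {se:.3f}")
```

Each unit's score is computed from models that never saw that unit, which is the role cross-fitting plays here. The printed standard errors are ordinary pairwise ones based on the sample variance of the score differences, not the simultaneous intervals described in the abstract.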
