Paper title
Local search for efficient causal effect estimation
Paper authors
Paper abstract
Causal effect estimation from observational data is a challenging problem, especially with high-dimensional data and in the presence of unobserved variables. The available data-driven methods for tackling the problem either provide only bounds on a causal effect (i.e., a non-unique estimate) or have low efficiency. The major hurdle to achieving high efficiency while obtaining a unique and unbiased causal effect estimate is finding a proper adjustment set for confounding control quickly, given the huge covariate space and the possibility of unobserved variables. In this paper, we approach the problem as a local search task for finding valid adjustment sets in data. We establish theorems that support the local search for adjustment sets, and we show that unique and unbiased estimation can be achieved from observational data even when unobserved variables exist. We then propose a data-driven algorithm that is fast and consistent under mild assumptions. We also use a frequent pattern mining method to further speed up the search for minimal adjustment sets for causal effect estimation. Experiments conducted on extensive synthetic and real-world datasets demonstrate that the proposed algorithm outperforms state-of-the-art criteria and estimators in both accuracy and time efficiency.
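To make the role of an adjustment set concrete, the following minimal sketch shows how a causal effect could be estimated once a valid adjustment set is already known. It is not the local search algorithm proposed in the paper; it only illustrates plain covariate adjustment by stratification for a binary treatment. The function name adjusted_effect, the column names T, Y and Z, and the toy data are all hypothetical.

```python
from collections import defaultdict


def adjusted_effect(rows, treatment, outcome, adjustment_set):
    """Estimate E[Y | do(T=1)] - E[Y | do(T=0)] by stratifying on adjustment_set.

    Assumes a binary treatment coded 0/1 and discrete adjustment variables.
    """
    # Group outcomes by stratum (values of the adjustment variables) and treatment arm.
    strata = defaultdict(lambda: {0: [], 1: []})
    for row in rows:
        key = tuple(row[name] for name in adjustment_set)
        strata[key][row[treatment]].append(row[outcome])

    effect = 0.0
    for groups in strata.values():
        if not groups[0] or not groups[1]:
            raise ValueError("a stratum lacks treated or untreated units")
        # Weight each stratum by its share of the sample.
        weight = (len(groups[0]) + len(groups[1])) / len(rows)
        diff = sum(groups[1]) / len(groups[1]) - sum(groups[0]) / len(groups[0])
        effect += weight * diff
    return effect


# Hypothetical toy data: Z raises both the chance of treatment and the outcome,
# while the true effect of T on Y is 1.0 within every stratum of Z.
rows = [
    {"Z": 0, "T": 0, "Y": 1.0},
    {"Z": 0, "T": 0, "Y": 1.0},
    {"Z": 0, "T": 0, "Y": 1.0},
    {"Z": 0, "T": 1, "Y": 2.0},
    {"Z": 1, "T": 0, "Y": 5.0},
    {"Z": 1, "T": 1, "Y": 6.0},
    {"Z": 1, "T": 1, "Y": 6.0},
    {"Z": 1, "T": 1, "Y": 6.0},
]

naive = adjusted_effect(rows, "T", "Y", [])        # no adjustment: confounded
adjusted = adjusted_effect(rows, "T", "Y", ["Z"])  # adjust for Z
print(naive, adjusted)  # prints "3.0 1.0"
```

With an empty adjustment set the estimate absorbs the confounding through Z, while adjusting for Z recovers the within-stratum effect. The cost of trying candidate sets this way grows quickly with the number of covariates, which is the efficiency concern the abstract raises.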