论文标题
从观察中学习和抽样原子干预措施
Learning and Sampling of Atomic Interventions from Observations
论文作者
论文摘要
我们研究了使用因果贝叶斯网络中的观测样本有效估计干预对单个变量(原子干预)的影响的问题。我们的目标是提供在非参数环境中在时间和样本复杂性上都有效的算法。 田和珍珠(AAAI`02)已经完全表征了可因果关系的因果关系类别,可以从观察数据中鉴定出原子干预的因果影响。我们将其结果定量。假设p是n个可观察变量的集合$ \ vec {v} $的因果模型,相对于带有可观察分布的给定因果图G。令$ p_x $表示对观测值的介入分布,相对于指定变量x带x的干预。 Assuming that $G$ has bounded in-degree, bounded c-components ($k$), and that the observational distribution is identifiable and satisfies certain strong positivity condition, we give an algorithm that takes $m=\tilde{O}(nε^{-2})$ samples from $P$ and $O(mn)$ time, and outputs with high probability a description of a distribution $ \ hat {p} $使得$ d _ {\ mathrm {tv}}}}(p_x,\ hat {p})\leqε$,和: 1。[评估]描述可以在$ o(n)$ time中返回概率$ \ hat {p}(\ vec {v})$ for nistmentment $ \ vec {v} $ to $ \ vec {vec {v} $ 2。[生成]描述可以从$ o(n)$ time中返回$ \ hat {p} $。 我们还显示了样本复杂性的下限,这表明我们的样本复杂性对参数$ n $和$ε$以及强阳性参数的$ k = 1 $具有最佳依赖性。
We study the problem of efficiently estimating the effect of an intervention on a single variable (atomic interventions) using observational samples in a causal Bayesian network. Our goal is to give algorithms that are efficient in both time and sample complexity in a non-parametric setting. Tian and Pearl (AAAI `02) have exactly characterized the class of causal graphs for which causal effects of atomic interventions can be identified from observational data. We make their result quantitative. Suppose P is a causal model on a set $\vec{V}$ of n observable variables with respect to a given causal graph G with observable distribution $P$. Let $P_x$ denote the interventional distribution over the observables with respect to an intervention of a designated variable X with x. Assuming that $G$ has bounded in-degree, bounded c-components ($k$), and that the observational distribution is identifiable and satisfies certain strong positivity condition, we give an algorithm that takes $m=\tilde{O}(nε^{-2})$ samples from $P$ and $O(mn)$ time, and outputs with high probability a description of a distribution $\hat{P}$ such that $d_{\mathrm{TV}}(P_x, \hat{P}) \leq ε$, and: 1. [Evaluation] the description can return in $O(n)$ time the probability $\hat{P}(\vec{v})$ for any assignment $\vec{v}$ to $\vec{V}$ 2. [Generation] the description can return an iid sample from $\hat{P}$ in $O(n)$ time. We also show lower bounds for the sample complexity showing that our sample complexity has an optimal dependence on the parameters $n$ and $ε$, as well as if $k=1$ on the strong positivity parameter.