Paper title
A Bayesian Nonparametric Conditional Two-sample Test with an Application to Local Causal Discovery
Paper authors
Paper abstract
For a continuous random variable $Z$, testing conditional independence $X \perp\!\!\!\perp Y |Z$ is known to be a particularly hard problem. It constitutes a key ingredient of many constraint-based causal discovery algorithms. These algorithms are often applied to datasets containing binary variables, which indicate the 'context' of the observations, e.g. a control or treatment group within an experiment. In these settings, conditional independence testing with $X$ or $Y$ binary (and the other continuous) is paramount to the performance of the causal discovery algorithm. To our knowledge no nonparametric 'mixed' conditional independence test currently exists, and in practice tests that assume all variables to be continuous are used instead. In this paper we aim to fill this gap, as we combine elements of Holmes et al. (2015) and Teymur and Filippi (2020) to propose a novel Bayesian nonparametric conditional two-sample test. Applied to the Local Causal Discovery algorithm, we investigate its performance on both synthetic and real-world data, and compare with state-of-the-art conditional independence tests.