Paper Title
Causal Discovery from Incomplete Data using An Encoder and Reinforcement Learning
Paper Authors
Paper Abstract
Discovering causal structure among a set of variables is a fundamental problem in many domains. However, state-of-the-art methods seldom consider the possibility that the observational data has missing values (incomplete data), which are ubiquitous in real-world settings. Missing values can significantly impair performance and even cause causal discovery algorithms to fail. In this paper, we propose an approach for discovering causal structures from incomplete data using a novel encoder and reinforcement learning (RL). The encoder is designed for both missing-data imputation and feature extraction. In particular, it learns to encode the currently available information (with missing values) into a robust feature representation, which is then used to determine where to search for the best graph. The encoder is integrated into an RL framework that can be optimized using the actor-critic algorithm. Our method takes incomplete observational data as input and generates a causal structure graph. Experimental results on synthetic and real data demonstrate that our method can robustly generate causal structures from incomplete data. Compared with the direct combination of data imputation and causal discovery methods, our method generally performs better and achieves a performance gain of up to 43.2%.
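To make the baseline setting concrete, the following is a minimal sketch of the "direct combination of data imputation and causal discovery" that the abstract compares against. It is not the proposed encoder or RL procedure. Everything in it is an illustrative assumption rather than a detail taken from the paper: the linear-Gaussian data model, values missing completely at random, column-mean imputation, and an equal-variance least-squares BIC as the graph score. All function names are hypothetical.

```python
import numpy as np


def simulate_linear_sem(weights, n_samples, rng):
    """Sample a linear-Gaussian SEM; weights[i, j] != 0 means edge i -> j.

    Assumption: variables are in topological order, so `weights` is
    strictly upper triangular and parents are generated before children.
    """
    d = weights.shape[0]
    data = np.zeros((n_samples, d))
    for j in range(d):
        data[:, j] = data @ weights[:, j] + rng.normal(size=n_samples)
    return data


def mask_completely_at_random(data, missing_rate, rng):
    """Replace each entry with NaN independently with prob. `missing_rate`."""
    incomplete = data.copy()
    incomplete[rng.random(data.shape) < missing_rate] = np.nan
    return incomplete


def mean_impute(incomplete):
    """Fill every NaN with the mean of the observed values in its column."""
    col_means = np.nanmean(incomplete, axis=0)
    return np.where(np.isnan(incomplete), col_means, incomplete)


def is_acyclic(adjacency):
    """A binary graph on d nodes is a DAG iff its adjacency matrix A has A^d == 0."""
    d = adjacency.shape[0]
    power = np.linalg.matrix_power(adjacency.astype(np.int64), d)
    return not power.any()


def bic_score(data, adjacency):
    """Equal-variance least-squares BIC of a candidate DAG (lower is better)."""
    n, d = data.shape
    rss = 0.0
    for j in range(d):
        parents = np.flatnonzero(adjacency[:, j])
        residual = data[:, j] - data[:, j].mean()
        if parents.size:
            design = data[:, parents] - data[:, parents].mean(axis=0)
            coef, *_ = np.linalg.lstsq(design, residual, rcond=None)
            residual = residual - design @ coef
        rss += float(residual @ residual)
    return n * d * np.log(rss / (n * d)) + adjacency.sum() * np.log(n)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical ground truth: the chain 0 -> 1 -> 2.
    true_weights = np.array([[0.0, 2.0, 0.0],
                             [0.0, 0.0, 2.0],
                             [0.0, 0.0, 0.0]])
    true_graph = (true_weights != 0).astype(int)
    complete = simulate_linear_sem(true_weights, 500, rng)
    imputed = mean_impute(mask_completely_at_random(complete, 0.3, rng))
    print("acyclic:", is_acyclic(true_graph))
    print("BIC, complete data:", bic_score(complete, true_graph))
    print("BIC, imputed data: ", bic_score(imputed, true_graph))
```

In this sketch, imputation and scoring are separate stages, so any distortion introduced by imputation feeds straight into the graph score. The abstract's approach instead uses a single encoder for both imputation and feature extraction inside the RL search.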