Paper Title
Interventional Causal Representation Learning
Paper Authors
Paper Abstract
Causal representation learning seeks to extract high-level latent factors from low-level sensory data. Most existing methods rely on observational data and structural assumptions (e.g., conditional independence) to identify the latent factors. However, interventional data is prevalent across applications. Can interventional data facilitate causal representation learning? We explore this question in this paper. The key observation is that interventional data often carries geometric signatures of the latent factors' support (i.e., the set of values each latent factor can possibly take). For example, when the latent factors are causally connected, an intervention can break the dependency between an intervened latent's support and the supports of its ancestors. Leveraging this fact, we prove that the latent causal factors can be identified up to permutation and scaling given data from perfect $do$ interventions. Moreover, given data from imperfect interventions, we achieve block affine identification: each estimated latent factor is entangled with only a few other latent factors. These results highlight the unique power of interventional data in causal representation learning: it enables provable identification of latent factors without any assumptions about their distributions or dependency structure.
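To make the "geometric signature" intuition concrete, here is a minimal NumPy sketch (our own illustration, not the paper's algorithm). It simulates two causally connected latents z1 -> z2 whose observational supports are coupled (z2 is drawn uniformly from [0, z1]) and shows that a perfect $do$ intervention on z2 flattens that coupling; the variable names and the toy structural equations are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Observational regime: z1 -> z2, and z2's support [0, z1] depends on z1.
z1_obs = rng.uniform(1.0, 2.0, size=n)
z2_obs = rng.uniform(0.0, z1_obs)  # high endpoint varies with the ancestor

# Perfect do-intervention do(z2 = c): z2 is pinned to a constant, so its
# support no longer carries any information about its ancestor z1.
c = 0.5
z1_int = rng.uniform(1.0, 2.0, size=n)
z2_int = np.full(n, c)

# Geometric signature: conditioned on z1, the upper edge of z2's support
# tracks z1 observationally but is flat (= c) under the intervention.
for lo, hi in [(1.0, 1.25), (1.75, 2.0)]:
    mask_obs = (z1_obs >= lo) & (z1_obs < hi)
    mask_int = (z1_int >= lo) & (z1_int < hi)
    print(f"z1 in [{lo}, {hi}): "
          f"obs max z2 = {z2_obs[mask_obs].max():.2f}, "
          f"int max z2 = {z2_int[mask_int].max():.2f}")
```

The printed maxima rise with z1 in the observational data but sit at the constant c under the intervention; it is this kind of support-level change, rather than any distributional or dependency assumption, that the paper's identification results exploit.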