Paper Title

Causal Discovery in Physical Systems from Videos

Paper Authors

Yunzhu Li, Antonio Torralba, Animashree Anandkumar, Dieter Fox, Animesh Garg

Abstract

Causal discovery is at the core of human cognition. It enables us to reason about the environment and make counterfactual predictions about unseen scenarios that can vastly differ from our previous experiences. We consider the task of causal discovery from videos in an end-to-end fashion without supervision on the ground-truth graph structure. In particular, our goal is to discover the structural dependencies among environmental and object variables: inferring the type and strength of interactions that have a causal effect on the behavior of the dynamical system. Our model consists of (a) a perception module that extracts a semantically meaningful and temporally consistent keypoint representation from images, (b) an inference module for determining the graph distribution induced by the detected keypoints, and (c) a dynamics module that can predict the future by conditioning on the inferred graph. We assume access to different configurations and environmental conditions, i.e., data from unknown interventions on the underlying system; thus, we can hope to discover the correct underlying causal graph without explicit interventions. We evaluate our method in a planar multi-body interaction environment and scenarios involving fabrics of different shapes like shirts and pants. Experiments demonstrate that our model can correctly identify the interactions from a short sequence of images and make long-term future predictions. The causal structure assumed by the model also allows it to make counterfactual predictions and extrapolate to systems of unseen interaction graphs or graphs of various sizes.
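
The data flow between the inference and dynamics stages described in the abstract can be illustrated with a toy sketch. This is not the authors' model: the paper's modules are learned end to end, whereas here the perception step is skipped (keypoints are given directly), a hand-written constant-distance heuristic stands in for the inference module, and a one-step neighbor-averaged velocity extrapolation stands in for the graph-conditioned dynamics module. The names `infer_edges`, `predict_next`, `tol`, and `frames` are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def infer_edges(frames, tol=1e-6):
    """Toy stand-in for the inference module (an assumption, not the paper's
    method): link two keypoints when their distance stays nearly constant
    across all observed frames."""
    num_keypoints = len(frames[0])
    edges = set()
    for i, j in combinations(range(num_keypoints), 2):
        dists = [math.dist(frame[i], frame[j]) for frame in frames]
        if max(dists) - min(dists) <= tol:
            edges.add((i, j))
    return edges


def predict_next(frames, edges):
    """Toy stand-in for the dynamics module (also an assumption): each
    keypoint moves with the average of its own last velocity and the last
    velocities of its neighbors in the inferred graph."""
    prev, last = frames[-2], frames[-1]
    velocities = [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(prev, last)]

    # Every keypoint is its own neighbor, plus the keypoints it is linked to.
    neighbors = {i: [i] for i in range(len(last))}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    prediction = []
    for i, (x, y) in enumerate(last):
        vx = sum(velocities[k][0] for k in neighbors[i]) / len(neighbors[i])
        vy = sum(velocities[k][1] for k in neighbors[i]) / len(neighbors[i])
        prediction.append((x + vx, y + vy))
    return prediction


# Three keypoints over three frames: 0 and 1 translate together, 2 moves alone.
frames = [
    [(0, 0), (1, 0), (5, 0)],
    [(1, 0), (2, 0), (5, 1)],
    [(2, 0), (3, 0), (5, 2)],
]
edges = infer_edges(frames)
next_frame = predict_next(frames, edges)
```

The sketch only mirrors the interface implied by the abstract: observed keypoints go in, a graph comes out, and the prediction step takes that graph as an input. A counterfactual query in this toy setting would amount to calling `predict_next` with a modified `edges` set.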
