Paper Title

Learning Action-Effect Dynamics from Pairs of Scene-graphs

Authors

Shailaja Keyur Sampat, Pratyay Banerjee, Yezhou Yang, Chitta Baral

Abstract

'Actions' play a vital role in how humans interact with the world. Thus, autonomous agents that would assist us in everyday tasks also require the capability to perform 'Reasoning about Actions & Change' (RAC). Recently, there has been growing interest in the study of RAC with visual and linguistic inputs. Graphs are often used to represent the semantic structure of visual content (i.e., objects, their attributes, and the relationships among objects) and are commonly referred to as scene-graphs. In this work, we propose a novel method that leverages the scene-graph representation of images to reason about the effects of actions described in natural language. We experiment with the existing CLEVR_HYP (Sampat et al., 2021) dataset and show that our proposed approach is effective in terms of performance, data efficiency, and generalization capability compared to existing models.
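To make the notion of a scene-graph pair concrete, the following is a minimal Python sketch of a graph holding objects, their attributes, and relations, together with two hand-coded action effects that produce an "after" graph from a "before" graph. It is an illustration only and not the authors' model, which learns action effects rather than hard-coding them. The names `SceneGraph`, `apply_change`, and `apply_remove`, as well as the attribute and relation vocabulary, are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    # Node id -> attribute dictionary, e.g. {"shape": "cube", "color": "red"}
    objects: dict = field(default_factory=dict)
    # Directed edges stored as (subject_id, relation, object_id) triples
    relations: set = field(default_factory=set)


def apply_change(graph, obj_id, attribute, new_value):
    """Hypothetical action effect: return a new graph with one attribute changed."""
    objects = {k: dict(v) for k, v in graph.objects.items()}
    objects[obj_id][attribute] = new_value
    return SceneGraph(objects=objects, relations=set(graph.relations))


def apply_remove(graph, obj_id):
    """Hypothetical action effect: drop an object and every relation touching it."""
    objects = {k: dict(v) for k, v in graph.objects.items() if k != obj_id}
    relations = {r for r in graph.relations if obj_id not in (r[0], r[2])}
    return SceneGraph(objects=objects, relations=relations)


# A "before" graph for a two-object scene (illustrative values only)
before = SceneGraph(
    objects={
        "o1": {"shape": "cube", "color": "red", "size": "large"},
        "o2": {"shape": "sphere", "color": "blue", "size": "small"},
    },
    relations={("o1", "left_of", "o2"), ("o2", "right_of", "o1")},
)

# The "after" graph for an action such as "paint the large cube green"
after = apply_change(before, "o1", "color", "green")
```

Here `(before, after)` forms one scene-graph pair. Both helper functions return a fresh graph instead of mutating their input, so the "before" state stays available for comparison.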
