Paper Title

Scene Graph Modification Based on Natural Language Commands

Authors

Xuanli He, Quan Hung Tran, Gholamreza Haffari, Walter Chang, Trung Bui, Zhe Lin, Franck Dernoncourt, Nhan Dam

Abstract

Structured representations such as graphs and parse trees play a crucial role in many Natural Language Processing systems. In recent years, advances in multi-turn user interfaces have made it necessary to control and update these structured representations as new sources of information arrive. Although many efforts have focused on improving the performance of parsers that map text to graphs or parse trees, very few have explored the problem of directly manipulating these representations. In this paper, we explore the novel problem of graph modification, where a system needs to learn how to update an existing scene graph given a new user command. Our novel models, based on a graph-based sparse transformer and cross-attention information fusion, outperform previous systems adapted from the machine translation and graph generation literature. We further contribute our large graph modification datasets to the research community to encourage future research on this new problem.
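
To make the task in the abstract concrete, here is a minimal sketch of what "updating an existing scene graph given a user command" means at the data level. It is an illustration only and not the authors' model: the `SceneGraph` class, its `insert` and `delete` methods, and the example command are all hypothetical, and the edit operation is written by hand, whereas the paper's systems learn to produce the updated graph from the command text.

```python
from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    """Toy scene graph (hypothetical): nodes plus (subject, relation, object) edges."""
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)

    def insert(self, subject, relation, obj):
        # Add both endpoints and the edge connecting them.
        self.nodes.update({subject, obj})
        self.edges.add((subject, relation, obj))

    def delete(self, node):
        # Remove the node and every edge that touches it.
        self.nodes.discard(node)
        self.edges = {e for e in self.edges if node not in (e[0], e[2])}


# Source graph, e.g. for the description "a boy wearing a red shirt".
graph = SceneGraph()
graph.insert("boy", "wearing", "shirt")
graph.insert("shirt", "has attribute", "red")

# Assumed user command: "I don't care about the color of the shirt."
# A learned model would map (graph, command) to the target graph;
# here the corresponding edit is applied manually.
graph.delete("red")
```

After the edit, the graph keeps only the `boy` and `shirt` nodes and the `wearing` edge between them, which is the kind of target graph a graph modification system is expected to output.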
