Paper Title

Scene Graph Expansion for Semantics-Guided Image Outpainting

Paper Authors

Chiao-An Yang, Cheng-Yo Tan, Wan-Cyuan Fan, Cheng-Fu Yang, Meng-Lin Wu, Yu-Chiang Frank Wang

Paper Abstract

In this paper, we address the task of semantics-guided image outpainting, which is to complete an image by generating semantically practical content. Different from most existing image outpainting works, we approach the above task by understanding and completing image semantics at the scene graph level. In particular, we propose a novel network of Scene Graph Transformer (SGT), which is designed to take node and edge features as inputs for modeling the associated structural information. To better understand and process graph-based inputs, our SGT uniquely performs feature attention at both node and edge levels. While the former views edges as relationship regularization, the latter observes the co-occurrence of nodes for guiding the attention process. We demonstrate that, given a partial input image with its layout and scene graph, our SGT can be applied for scene graph expansion and its conversion to a complete layout. Following state-of-the-art layout-to-image conversion works, the task of image outpainting can be completed with sufficient and practical semantics introduced. Extensive experiments are conducted on the datasets of MS-COCO and Visual Genome, which quantitatively and qualitatively confirm the effectiveness of our proposed SGT and outpainting framework.
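To make the idea of attention at both node and edge levels more concrete, below is a minimal NumPy sketch of edge-biased node attention: edge features bias the node-to-node attention logits, loosely playing the "relationship regularization" role the abstract describes. All function names and shapes here are hypothetical illustrations, not the paper's actual SGT implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def node_attention(nodes, edges):
    """Toy edge-biased attention over scene-graph nodes.

    nodes: (N, d) node features; edges: (N, N, d) pairwise edge features.
    Edge features are reduced to a scalar bias added to the attention
    logits, so relationships regularize how nodes attend to each other
    (a hypothetical simplification of the paper's node/edge attention).
    """
    d = nodes.shape[-1]
    logits = nodes @ nodes.T / np.sqrt(d)  # plain node-level attention logits
    logits = logits + edges.mean(axis=-1)  # edge features as an additive bias
    attn = softmax(logits, axis=-1)
    return attn @ nodes                    # attention-weighted node update

rng = np.random.default_rng(0)
N, d = 4, 8
nodes = rng.standard_normal((N, d))
edges = rng.standard_normal((N, N, d))
out = node_attention(nodes, edges)
print(out.shape)  # (4, 8)
```

In the paper's full model, node-level and edge-level attention are separate mechanisms with learned projections; this sketch collapses both into a single additive bias purely to show how graph structure can enter a transformer-style attention computation.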
