Importance Weighted Structure Learning for Scene Graph Generation

Paper Title

Importance Weighted Structure Learning for Scene Graph Generation

Authors

Daqi Liu, Miroslaw Bober, Josef Kittler

Abstract

Scene graph generation is a structured prediction task that aims to explicitly model objects and their relationships by constructing a visually grounded scene graph for an input image. Currently, the message passing neural network based mean field variational Bayesian methodology is the ubiquitous solution for this task, in which the variational inference objective is often assumed to be the classical evidence lower bound. However, the variational approximation inferred from such a loose objective generally underestimates the underlying posterior, which often leads to inferior generation performance. In this paper, we propose a novel importance weighted structure learning method that approximates the underlying log-partition function with a tighter importance weighted lower bound, computed from multiple samples drawn from a reparameterizable Gumbel-Softmax sampler. A generic entropic mirror descent algorithm is applied to solve the resulting constrained variational inference task. The proposed method achieves state-of-the-art performance on various popular scene graph generation benchmarks.
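As a rough illustration of the ingredients named in the abstract, the following Python sketch works on a toy problem: a single categorical variable with unnormalized log-potentials, not a scene graph. It is not the authors' implementation. The function names `gumbel_softmax`, `iw_bound`, and `entropic_mirror_descent`, the temperature `tau`, the step size `eta`, and the toy potentials are all assumptions made for this example. For simplicity, the mirror descent step here minimizes the plain KL objective (the negative evidence lower bound up to a constant) rather than the importance weighted bound described in the abstract.

```python
import numpy as np

def logsumexp(a):
    """Numerically stable log(sum(exp(a))) for a 1-D array."""
    m = np.max(a)
    return m + np.log(np.sum(np.exp(a - m)))

def gumbel_softmax(logits, tau, rng, size):
    """Draw `size` relaxed one-hot samples for Categorical(softmax(logits))."""
    u = rng.uniform(low=1e-12, high=1.0, size=(size, logits.shape[0]))
    g = -np.log(-np.log(u))                      # standard Gumbel noise
    y = (logits + g) / tau
    y = y - y.max(axis=1, keepdims=True)         # stabilize the softmax
    e = np.exp(y)
    return e / e.sum(axis=1, keepdims=True)

def iw_bound(log_p_tilde, log_q, k, rng, tau=0.5):
    """One Monte Carlo draw of the k-sample importance weighted bound.

    log_q must hold normalized log-probabilities. The argmax of each
    relaxed sample is an exact draw from q (the Gumbel-max trick).
    """
    idx = gumbel_softmax(log_q, tau, rng, k).argmax(axis=1)
    log_w = log_p_tilde[idx] - log_q[idx]        # log importance weights
    return logsumexp(log_w) - np.log(k)          # log of the average weight

def entropic_mirror_descent(log_p_tilde, q0, eta=0.5, steps=50):
    """Exponentiated-gradient updates on the probability simplex.

    Assumption: the objective is KL(q || p_tilde), whose gradient with
    respect to q is log q - log p_tilde + 1.
    """
    q = q0.copy()
    for _ in range(steps):
        grad = np.log(q) - log_p_tilde + 1.0
        q = q * np.exp(-eta * grad)              # multiplicative update
        q = q / q.sum()                          # project back onto the simplex
    return q

# Toy setup (hypothetical values): four categories, uniform initial q.
rng = np.random.default_rng(0)
log_p_tilde = np.array([2.0, 0.5, -1.0, 0.0])
log_z = logsumexp(log_p_tilde)                   # exact log-partition function
q0 = np.full(4, 0.25)
log_q0 = np.log(q0)

elbo = np.sum(q0 * (log_p_tilde - log_q0))       # exact single-sample bound
iw = np.mean([iw_bound(log_p_tilde, log_q0, 16, rng) for _ in range(2000)])
q_star = entropic_mirror_descent(log_p_tilde, q0)

print("ELBO:", elbo, "IW bound (k=16):", iw, "log Z:", log_z)
print("q after mirror descent:", q_star)
```

In this toy setting the averaged 16-sample bound should land between the evidence lower bound and the exact log-partition function, and the mirror descent iterates should approach the normalized target distribution, since each update interpolates between log q and the log-potentials.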

Paper Title