Paper Title

Detecting Objects with Context-Likelihood Graphs and Graph Refinement

Authors

Aritra Bhowmik, Yu Wang, Nora Baka, Martin R. Oswald, Cees G. M. Snoek

Abstract

The goal of this paper is to detect objects by exploiting their interrelationships. Contrary to existing methods, which learn objects and relations separately, our key idea is to learn the object-relation distribution jointly. We first propose a novel way of creating a graphical representation of an image from inter-object relation priors and initial class predictions, which we call a context-likelihood graph. We then learn the joint distribution with an energy-based modeling technique, which allows us to sample and refine the context-likelihood graph iteratively for a given image. Our formulation of jointly learning the distribution enables us to generate a more accurate graph representation of an image, which leads to better object detection performance. We demonstrate the benefits of our context-likelihood graph formulation and the energy-based graph refinement via experiments on the Visual Genome and MS-COCO datasets, where we achieve a consistent improvement over object detectors like DETR and Faster-RCNN, as well as over alternative methods modeling object interrelationships separately. Our method is detector-agnostic, end-to-end trainable, and especially beneficial for rare object classes.
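The abstract describes two steps: building a context-likelihood graph from initial class predictions and an inter-object relation prior, then refining that graph with an energy-based sampler. The following is a minimal PyTorch sketch of how such a pipeline could look. It is not the paper's implementation: the function names, the bilinear graph construction, the Langevin-style refinement loop, and the quadratic placeholder energy are all assumptions for illustration.

    import torch
    import torch.nn.functional as F

    def build_context_likelihood_graph(class_logits, cooccurrence_prior):
        # class_logits: (N, C) initial class predictions for N candidate boxes.
        # cooccurrence_prior: (C, C) inter-object relation prior over classes.
        # Assumed edge weight: A[i, j] = sum_{c, c'} p_i(c) * prior[c, c'] * p_j(c'),
        # i.e. the expected relation likelihood under the class posteriors.
        probs = F.softmax(class_logits, dim=-1)          # (N, C)
        return probs @ cooccurrence_prior @ probs.T      # (N, N) adjacency

    def refine_graph(adjacency, energy_fn, steps=10, step_size=0.1, noise_scale=0.01):
        # Iteratively refine the graph by noisy gradient descent on an energy
        # function: a Langevin-style sampler, assumed here for illustration.
        A = adjacency.detach().requires_grad_(True)
        for _ in range(steps):
            grad, = torch.autograd.grad(energy_fn(A), A)  # energy_fn must return a scalar
            with torch.no_grad():
                A = A - step_size * grad + noise_scale * torch.randn_like(A)
            A.requires_grad_(True)
        return A.detach()

    # Toy usage with random inputs and a hypothetical stand-in energy:
    N, C = 5, 80
    logits, prior = torch.randn(N, C), torch.rand(C, C)
    A0 = build_context_likelihood_graph(logits, prior)
    A = refine_graph(A0, lambda A: (A ** 2).sum())

In the paper the energy model is learned jointly with the detector, so the refined graph feeds back into classification; the quadratic energy above is only a placeholder to keep the sketch runnable.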
