Paper title
Ref-NMS: Breaking Proposal Bottlenecks in Two-Stage Referring Expression Grounding
Paper authors
Paper abstract
The prevailing framework for solving referring expression grounding is based on a two-stage process: 1) detecting proposals with an object detector and 2) grounding the referent to one of the proposals. Existing two-stage solutions mostly focus on the grounding step, which aims to align the expressions with the proposals. In this paper, we argue that these methods overlook an obvious mismatch between the roles of proposals in the two stages: they generate proposals solely based on the detection confidence (i.e., expression-agnostic), hoping that the proposals contain all the right instances in the expression (i.e., expression-aware). Due to this mismatch, current two-stage methods suffer from a severe performance drop between detected and ground-truth proposals. To this end, we propose Ref-NMS, which is the first method to yield expression-aware proposals at the first stage. Ref-NMS regards all nouns in the expression as critical objects, and introduces a lightweight module to predict a score for aligning each box with a critical object. These scores can guide the NMS operation to filter out the boxes irrelevant to the expression, increasing the recall of critical objects and resulting in significantly improved grounding performance. Since Ref-NMS is agnostic to the grounding step, it can be easily integrated into any state-of-the-art two-stage method. Extensive ablation studies on several backbones, benchmarks, and tasks consistently demonstrate the superiority of Ref-NMS. Code is available at: https://github.com/ChopinSharp/ref-nms.
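To make the idea of score-guided NMS concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say how the relatedness scores enter NMS, so this sketch assumes they are simply multiplied with the detection confidence before a standard greedy NMS. All names (`iou`, `greedy_nms`, `expression_aware_nms`, `rel_scores`) and the toy boxes and scores are hypothetical, and the lightweight scoring module itself is not modeled.

```python
from typing import List, Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_thresh: float = 0.5,
    top_k: Optional[int] = None,
) -> List[int]:
    """Standard greedy NMS; returns indices of kept boxes, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
            if top_k is not None and len(keep) == top_k:
                break
    return keep


def expression_aware_nms(
    boxes: Sequence[Box],
    det_scores: Sequence[float],
    rel_scores: Sequence[float],
    iou_thresh: float = 0.5,
    top_k: Optional[int] = None,
) -> List[int]:
    """NMS ranked by a fused score.

    ASSUMPTION: fusion is the product of detection confidence and the
    expression-relatedness score; the abstract does not specify this.
    """
    fused = [d * r for d, r in zip(det_scores, rel_scores)]
    return greedy_nms(boxes, fused, iou_thresh, top_k)


# Toy data (made up): boxes 0 and 1 overlap heavily; box 2 is far away.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
det_scores = [0.9, 0.8, 0.7]    # detector confidence, expression-agnostic
rel_scores = [0.1, 0.9, 0.05]   # hypothetical relatedness to the expression

print(greedy_nms(boxes, det_scores))                        # [0, 2]
print(expression_aware_nms(boxes, det_scores, rel_scores))  # [1, 2]
```

In this toy case plain NMS keeps box 0 and suppresses the overlapping box 1, while the fused ranking keeps box 1, the one the hypothetical relatedness score ties to the expression. That is the recall effect the abstract describes, shown here only under the stated fusion assumption.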