Should All Proposals be Treated Equally in Object Detection?

Paper Title

Should All Proposals be Treated Equally in Object Detection?

Authors

Yunsheng Li, Yinpeng Chen, Xiyang Dai, Dongdong Chen, Mengchen Liu, Pei Yu, Jing Yin, Lu Yuan, Zicheng Liu, Nuno Vasconcelos

Abstract

The complexity-precision trade-off of an object detector is a critical problem for resource-constrained vision tasks. Previous works have emphasized detectors implemented with efficient backbones. This work investigates how proposal processing by the detection head affects this trade-off. It is hypothesized that improved detection efficiency requires a paradigm shift towards the unequal processing of proposals, assigning more computation to good proposals than to poor ones. This results in better utilization of the available computational budget, enabling higher accuracy for the same FLOPs. We formulate this as a learning problem where the goal is to assign operators to proposals, in the detection head, so that the total computational cost is constrained and the precision is maximized. The key finding is that such matching can be learned as a function that maps each proposal embedding into a one-hot code over operators. While this function induces a complex dynamic network routing mechanism, it can be implemented by a simple MLP and learned end-to-end with off-the-shelf object detectors. This 'dynamic proposal processing' (DPP) is shown to outperform state-of-the-art end-to-end object detectors (DETR, Sparse R-CNN) by a clear margin for a given computational complexity.
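
The routing idea in the abstract, a small MLP that maps each proposal embedding to a one-hot code over operators, can be illustrated with a minimal inference-time sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two-layer MLP shape, the random weights, the three hypothetical operators with relative costs in `OPERATOR_COSTS`, and the helper names `mlp_logits`, `one_hot_argmax`, and `route`. Training (how the discrete choice is learned end-to-end under a cost constraint) is not shown.

```python
import random

# Hypothetical relative costs of three operators (heavy, light, skip).
# These values are illustrative assumptions, not figures from the paper.
OPERATOR_COSTS = [1.0, 0.25, 0.0]


def mlp_logits(x, w1, b1, w2, b2):
    """Two-layer MLP (linear -> ReLU -> linear) returning one logit per operator."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w2, b2)]


def one_hot_argmax(logits):
    """Turn logits into a one-hot code selecting the highest-scoring operator."""
    best = max(range(len(logits)), key=logits.__getitem__)
    return [1 if i == best else 0 for i in range(len(logits))]


def route(proposals, params, operator_costs):
    """Assign one operator per proposal and sum the cost of the chosen operators."""
    codes = [one_hot_argmax(mlp_logits(p, *params)) for p in proposals]
    total_cost = sum(c * cost
                     for code in codes
                     for c, cost in zip(code, operator_costs))
    return codes, total_cost


def random_params(dim, hidden, n_ops, seed):
    """Random stand-in weights; a real selector would be learned, not sampled."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(hidden)]
    w2 = [[rng.uniform(-1, 1) for _ in range(hidden)] for _ in range(n_ops)]
    return w1, [0.0] * hidden, w2, [0.0] * n_ops


rng = random.Random(0)
proposals = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(5)]
params = random_params(dim=4, hidden=8, n_ops=len(OPERATOR_COSTS), seed=1)
codes, cost = route(proposals, params, OPERATOR_COSTS)
print(codes, cost)
```

Because each proposal receives exactly one operator, the total cost is the sum of the selected operators' costs, which is the quantity the abstract describes as constrained while precision is maximized.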
