RTMDet: An Empirical Study of Designing Real-Time Object Detectors

Paper Title

RTMDet: An Empirical Study of Designing Real-Time Object Detectors

Authors

Lyu, Chengqi; Zhang, Wenwei; Huang, Haian; Zhou, Yue; Wang, Yudong; Liu, Yanyi; Zhang, Shilong; Chen, Kai

Abstract

In this paper, we aim to design an efficient real-time object detector that exceeds the YOLO series and is easily extensible for many object recognition tasks such as instance segmentation and rotated object detection. To obtain a more efficient model architecture, we explore an architecture that has compatible capacities in the backbone and neck, constructed by a basic building block that consists of large-kernel depth-wise convolutions. We further introduce soft labels when calculating matching costs in the dynamic label assignment to improve accuracy. Together with better training techniques, the resulting object detector, named RTMDet, achieves 52.8% AP on COCO with 300+ FPS on an NVIDIA 3090 GPU, outperforming the current mainstream industrial detectors. RTMDet achieves the best parameter-accuracy trade-off with tiny/small/medium/large/extra-large model sizes for various application scenarios, and obtains new state-of-the-art performance on real-time instance segmentation and rotated object detection. We hope the experimental results can provide new insights into designing versatile real-time object detectors for many object recognition tasks. Code and models are released at https://github.com/open-mmlab/mmdetection/tree/3.x/configs/rtmdet.
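
The abstract credits part of the efficiency to a building block made of large-kernel depth-wise convolutions. As a rough illustration of why such a block stays cheap, the sketch below counts convolution weights with the standard formula (in_channels / groups) × out_channels × k × k. The channel width of 256, the 5×5 kernel, and the helper name `conv_params` are assumptions chosen for illustration; none of them is taken from the abstract or from the released code.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Number of weights in a 2D convolution with a k x k kernel (bias ignored)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channels must be divisible by groups")
    return (c_in // groups) * c_out * k * k


channels = 256  # assumed layer width, for illustration only

# Ordinary (dense) 3x3 convolution: every output channel sees every input channel.
dense_3x3 = conv_params(channels, channels, 3)

# Depth-wise 5x5 convolution: groups == channels, so each channel has its own filter.
depthwise_5x5 = conv_params(channels, channels, 5, groups=channels)

# A 1x1 point-wise convolution is commonly paired with a depth-wise one to mix channels.
pointwise_1x1 = conv_params(channels, channels, 1)

print("dense 3x3:            ", dense_3x3)
print("depth-wise 5x5:       ", depthwise_5x5)
print("depth-wise + 1x1 pair:", depthwise_5x5 + pointwise_1x1)
```

Under these assumed sizes, the larger 5×5 depth-wise kernel plus a point-wise layer needs far fewer weights than a single dense 3×3 layer, which is the general reason a large receptive field can be affordable in a real-time model. The actual block layout used by RTMDet is described in the paper and the linked configs, not here.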
