Paper Title
DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR
Paper Authors
Paper Abstract
In this paper we present a novel query formulation using dynamic anchor boxes for DETR (DEtection TRansformer) and offer a deeper understanding of the role of queries in DETR. The new formulation directly uses box coordinates as queries in Transformer decoders and dynamically updates them layer by layer. Using box coordinates not only provides explicit positional priors that improve the query-to-feature similarity and eliminate the slow training convergence issue in DETR, but also allows us to modulate the positional attention map using the box width and height information. Such a design makes it clear that queries in DETR can be implemented as performing soft ROI pooling layer by layer in a cascade manner. As a result, it leads to the best performance on the MS-COCO benchmark among DETR-like detection models under the same setting, e.g., 45.7% AP using ResNet50-DC5 as the backbone, trained for 50 epochs. We also conducted extensive experiments to confirm our analysis and verify the effectiveness of our method. Code is available at https://github.com/SlongLiu/DAB-DETR.
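The two mechanisms named in the abstract, layer-by-layer anchor updates and width/height modulation of positional attention, can be illustrated with a small pure-Python sketch. This is not the authors' implementation. The function names, the logit-space update rule, the sinusoidal embedding, the plain `1/w` and `1/h` scaling, and the offsets in `layer_deltas` are all assumptions chosen for illustration.

```python
import math


def inverse_sigmoid(p, eps=1e-5):
    """Map a value in (0, 1) to logit space, clamped for numerical stability."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def refine_anchor(anchor, delta):
    """Update a normalized (x, y, w, h) anchor by an offset applied in logit space.

    Assumed update rule: the result always stays inside (0, 1).
    """
    return tuple(sigmoid(inverse_sigmoid(a) + d) for a, d in zip(anchor, delta))


def sine_embed(pos, dim=8, temperature=10000.0):
    """Sinusoidal embedding of a scalar position in [0, 1] (hypothetical settings)."""
    emb = []
    for i in range(dim // 2):
        freq = temperature ** (2 * i / dim)
        angle = 2 * math.pi * pos / freq
        emb.extend([math.sin(angle), math.cos(angle)])
    return emb


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def modulated_positional_score(anchor, px, py, dim=8):
    """Positional similarity between an anchor center and a feature location.

    The x term is divided by the anchor width and the y term by its height,
    so a wider or taller box yields a flatter score along that axis
    (a simplified stand-in for width/height modulation).
    """
    x, y, w, h = anchor
    sim_x = dot(sine_embed(x, dim), sine_embed(px, dim))
    sim_y = dot(sine_embed(y, dim), sine_embed(py, dim))
    return sim_x / w + sim_y / h


# Cascade refinement over decoder layers, using made-up per-layer offsets.
anchor = (0.5, 0.5, 0.2, 0.4)
layer_deltas = [(0.2, -0.1, 0.3, 0.0), (0.05, 0.0, 0.1, -0.05)]
for delta in layer_deltas:
    anchor = refine_anchor(anchor, delta)
```

In a real decoder the per-layer offsets would be predicted from the decoder output rather than hard-coded, and the scores would pass through a softmax over all feature locations; dividing by width and height then acts like a per-axis temperature on that softmax.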