Paper Title
Deformable DETR: Deformable Transformers for End-to-End Object Detection
Paper Authors
Paper Abstract
DETR has been recently proposed to eliminate the need for many hand-designed components in object detection while demonstrating good performance. However, it suffers from slow convergence and limited feature spatial resolution, due to the limitations of Transformer attention modules in processing image feature maps. To mitigate these issues, we propose Deformable DETR, whose attention modules only attend to a small set of key sampling points around a reference. Deformable DETR can achieve better performance than DETR (especially on small objects) with 10 times fewer training epochs. Extensive experiments on the COCO benchmark demonstrate the effectiveness of our approach. Code is released at https://github.com/fundamentalvision/Deformable-DETR.
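To make the core idea concrete, below is a minimal single-scale sketch of the deformable attention mechanism the abstract describes: each query predicts a few sampling offsets and attention weights, gathers value features only at those points around its reference, and aggregates them. Class, method, and argument names here are illustrative assumptions, not the authors' official implementation (which uses multi-scale feature maps and a custom CUDA kernel).

```python
# Simplified sketch of deformable attention (single scale, illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleDeformableAttention(nn.Module):
    def __init__(self, d_model=256, n_heads=8, n_points=4):
        super().__init__()
        self.n_heads = n_heads
        self.n_points = n_points
        self.head_dim = d_model // n_heads
        # Per query: 2D offsets and a scalar weight for each (head, point) pair.
        self.sampling_offsets = nn.Linear(d_model, n_heads * n_points * 2)
        self.attention_weights = nn.Linear(d_model, n_heads * n_points)
        self.value_proj = nn.Linear(d_model, d_model)
        self.output_proj = nn.Linear(d_model, d_model)

    def forward(self, query, reference_points, value, spatial_shape):
        # query: (B, Lq, C); reference_points: (B, Lq, 2) in normalized [0, 1] coords
        # value: (B, H*W, C) flattened feature map; spatial_shape: (H, W)
        B, Lq, C = query.shape
        H, W = spatial_shape
        v = self.value_proj(value).view(B, H, W, self.n_heads, self.head_dim)
        v = v.permute(0, 3, 4, 1, 2).reshape(B * self.n_heads, self.head_dim, H, W)

        offsets = self.sampling_offsets(query).view(B, Lq, self.n_heads, self.n_points, 2)
        weights = self.attention_weights(query).view(B, Lq, self.n_heads, self.n_points)
        weights = weights.softmax(-1)

        # Sampling locations = reference point + predicted offsets (normalized coords).
        ref = reference_points[:, :, None, None, :]            # (B, Lq, 1, 1, 2)
        loc = ref + offsets / offsets.new_tensor([W, H])       # (B, Lq, nH, nP, 2)
        grid = 2 * loc - 1                                     # grid_sample expects [-1, 1]
        grid = grid.permute(0, 2, 1, 3, 4).reshape(B * self.n_heads, Lq, self.n_points, 2)

        # Bilinear sampling of value features at the few predicted points only.
        sampled = F.grid_sample(v, grid, mode='bilinear', align_corners=False)
        # sampled: (B*nH, head_dim, Lq, nP); weighted sum over sampling points.
        w = weights.permute(0, 2, 1, 3).reshape(B * self.n_heads, 1, Lq, self.n_points)
        out = (sampled * w).sum(-1)                            # (B*nH, head_dim, Lq)
        out = out.view(B, self.n_heads, self.head_dim, Lq).permute(0, 3, 1, 2).reshape(B, Lq, C)
        return self.output_proj(out)
```

Because each query touches only `n_heads * n_points` locations instead of every pixel, the cost of attention no longer scales with the full spatial resolution of the feature map, which is the property the abstract credits for faster convergence and the ability to use higher-resolution features.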