Paper Title

Conditional DETR V2: Efficient Detection Transformer with Box Queries

Authors

Xiaokang Chen, Fangyun Wei, Gang Zeng, Jingdong Wang

Abstract

In this paper, we study Detection Transformer (DETR), an end-to-end object detection approach based on a transformer encoder-decoder architecture that needs no hand-crafted post-processing such as NMS. Conditional DETR, an improved DETR with fast training convergence, introduced box queries (originally called spatial queries) for the internal decoder layers. Inspired by it, we reformulate the object query into the box-query format: a composition of the embeddings of the reference point and the transformation of the box with respect to the reference point. This reformulation reveals the connection between the object query in DETR and the anchor box widely studied in Faster R-CNN. Furthermore, we learn the box queries from the image content, which further improves the detection quality of Conditional DETR while retaining fast training convergence. In addition, we adopt the idea of axial self-attention to reduce the memory cost and accelerate the encoder. The resulting detector, called Conditional DETR V2, achieves better results than Conditional DETR, uses less memory, and runs more efficiently. For example, with the DC5-ResNet-50 backbone, our approach achieves 44.8 AP at 16.4 FPS on the COCO val set; compared to Conditional DETR, it runs 1.6× faster, saves 74% of the overall memory cost, and improves AP by 1.0.
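To illustrate why axial self-attention can reduce encoder memory, the following is a minimal NumPy sketch of the general idea: attend along each row of the feature map, then along each column, instead of over all H×W positions at once. It is not the authors' implementation. The function names (`attend`, `axial_self_attention`, `attention_entries`), the single head, and the identity query/key/value projections are all assumptions made for brevity.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attend(x):
    """Single-head self-attention over the second-to-last axis.

    x has shape (..., n, d). Queries, keys, and values are x itself
    (identity projections, an assumption to keep the sketch short).
    """
    d = x.shape[-1]
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(d)  # (..., n, n)
    return softmax(scores, axis=-1) @ x               # (..., n, d)


def axial_self_attention(feat):
    """Row-wise then column-wise attention on a (H, W, d) feature map."""
    row_out = attend(feat)                  # each row attends over its W positions
    col_in = np.swapaxes(row_out, 0, 1)     # (W, H, d)
    col_out = attend(col_in)                # each column attends over its H positions
    return np.swapaxes(col_out, 0, 1)       # back to (H, W, d)


def attention_entries(H, W):
    """Attention weights stored per head: full 2D attention vs. axial."""
    full = (H * W) ** 2                     # every position attends to every position
    axial = H * W * W + W * H * H           # H row maps of W x W, W column maps of H x H
    return full, axial
```

For a 32×32 feature map, `attention_entries(32, 32)` gives 1,048,576 weights for full attention against 65,536 for the axial variant, a 16× reduction in attention-map size per head. This count covers only the attention weights in this simplified setting and is not meant to reproduce the overall memory saving reported in the abstract.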
