Paper Title
EfficientPose: An efficient, accurate and scalable end-to-end 6D multi object pose estimation approach
Paper Authors
Paper Abstract
In this paper we introduce EfficientPose, a new approach for 6D object pose estimation. Our method is highly accurate, efficient, and scalable over a wide range of computational resources. Moreover, it can detect the 2D bounding boxes of multiple objects and instances and estimate their full 6D poses in a single shot. This eliminates the significant increase in runtime that other approaches suffer from when dealing with multiple objects: such approaches first detect 2D targets, e.g. keypoints, and then solve a Perspective-n-Point (PnP) problem for each object to obtain its 6D pose. We also propose a novel augmentation method for direct 6D pose estimation approaches, called 6D augmentation, which improves performance and generalization. Our approach achieves a new state-of-the-art accuracy of 97.35% in terms of the ADD(-S) metric on the widely used 6D pose estimation benchmark dataset Linemod using RGB input, while still running end-to-end at over 27 FPS. Through its inherent handling of multiple objects and instances and its fused single-shot 2D object detection and 6D pose estimation, our approach runs end-to-end at over 26 FPS even with multiple objects (eight), making it highly attractive for many real-world scenarios. Code will be made publicly available at https://github.com/ybkscht/EfficientPose.
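To make the contrast concrete: the two-stage pipelines the abstract refers to first predict 2D keypoints per object and then solve a Perspective-n-Point problem for each one, so runtime grows with the object count. The following is a minimal, illustrative sketch (not from the paper) of that per-object PnP step, using a plain DLT solver on exact 3D–2D correspondences; all names and values here are made up for illustration.

```python
import numpy as np

def project(K, R, t, X):
    """Project 3D points X (N,3) into pixels using pose (R, t) and intrinsics K."""
    x = (K @ (R @ X.T + t.reshape(3, 1))).T  # (N,3) homogeneous pixel coords
    return x[:, :2] / x[:, 2:3]

def solve_pnp_dlt(K, X, x):
    """Recover (R, t) from >=6 exact 3D-2D correspondences via the DLT.

    This is the classic linear PnP: work in normalized camera coordinates
    y = K^-1 [u, v, 1]^T, stack two linear constraints per correspondence on
    the 3x4 projection matrix P ~ [R | t], and take the SVD null vector.
    """
    y = (np.linalg.inv(K) @ np.column_stack([x, np.ones(len(x))]).T).T
    A = []
    for (Xw, Yw, Zw), (u, v, _) in zip(X, y):
        A.append([Xw, Yw, Zw, 1, 0, 0, 0, 0, -u*Xw, -u*Yw, -u*Zw, -u])
        A.append([0, 0, 0, 0, Xw, Yw, Zw, 1, -v*Xw, -v*Yw, -v*Zw, -v])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    P = Vt[-1].reshape(3, 4)              # null vector, defined up to scale/sign
    U, S, Vt2 = np.linalg.svd(P[:, :3])   # project the 3x3 block onto a rotation
    R, scale = U @ Vt2, np.mean(S)
    if np.linalg.det(R) < 0:              # fix the sign so det(R) = +1
        R, scale = -R, -scale
    return R, P[:, 3] / scale

# Hypothetical example: one object's keypoints with a known ground-truth pose.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
cz, sz = np.cos(0.3), np.sin(0.3)
R_true = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
t_true = np.array([0.1, -0.2, 1.5])
X = np.random.default_rng(0).uniform(-0.5, 0.5, (8, 3))  # 8 model keypoints
x = project(K, R_true, t_true, X)                        # their 2D detections
R_est, t_est = solve_pnp_dlt(K, X, x)                    # per-object PnP solve
```

In a real pipeline this solve (typically a robust variant such as EPnP with RANSAC) runs once per detected object, which is the per-object cost EfficientPose avoids by regressing poses directly in the same single-shot network pass as the 2D detection.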