MobilePose: Real-Time Pose Estimation for Unseen Objects with Weak Shape Supervision

Paper Title

MobilePose: Real-Time Pose Estimation for Unseen Objects with Weak Shape Supervision

Authors

Tingbo Hou, Adel Ahmadyan, Liangkai Zhang, Jianing Wei, Matthias Grundmann

Abstract

In this paper, we address the problem of detecting unseen objects in RGB images and estimating their poses in 3D. We propose two mobile-friendly networks: MobilePose-Base and MobilePose-Shape. The former is used when only pose supervision is available, and the latter when shape supervision is also available, even if it is weak. We revisit the shape features used in previous methods, including segmentation and coordinate maps, and explain when and why pixel-level shape supervision can improve pose estimation. Consequently, we add shape prediction as an intermediate layer in MobilePose-Shape and let the network learn pose from shape. Our models are trained on a mix of real and synthetic data, with weak and noisy shape supervision. They are ultra-lightweight and run in real time on modern mobile devices (e.g. 36 FPS on a Galaxy S20). Compared with previous single-shot solutions, our method achieves higher accuracy while using a significantly smaller model (2~3% in model size or number of parameters).
