SSP-Net: Scalable Sequential Pyramid Networks for Real-Time 3D Human Pose Regression

Paper Title

SSP-Net: Scalable Sequential Pyramid Networks for Real-Time 3D Human Pose Regression

Authors

Diogo Luvizon, Hedi Tabia, David Picard

Abstract

In this paper we propose a highly scalable convolutional neural network, end-to-end trainable, for real-time 3D human pose regression from still RGB images. We call this approach the Scalable Sequential Pyramid Networks (SSP-Net) as it is trained with refined supervision at multiple scales in a sequential manner. Our network requires a single training procedure and is capable of producing its best predictions at 120 frames per second (FPS), or acceptable predictions at more than 200 FPS when cut at test time. We show that the proposed regression approach is invariant to the size of feature maps, allowing our method to perform multi-resolution intermediate supervisions and reaching results comparable to the state-of-the-art with very low resolution feature maps. We demonstrate the accuracy and the effectiveness of our method by providing extensive experiments on two of the most important publicly available datasets for 3D pose estimation, Human3.6M and MPI-INF-3DHP. Additionally, we provide relevant insights about our decisions on the network architecture and show its flexibility to meet the best precision-speed compromise.
