CanonPose: Self-Supervised Monocular 3D Human Pose Estimation in the Wild

Paper title

CanonPose: Self-Supervised Monocular 3D Human Pose Estimation in the Wild

Authors

Bastian Wandt, Marco Rudolph, Petrissa Zell, Helge Rhodin, Bodo Rosenhahn

Abstract

Human pose estimation from single images is a challenging problem in computer vision that requires large amounts of labeled training data to be solved accurately. Unfortunately, for many human activities (e.g., outdoor sports) such training data does not exist and is hard or even impossible to acquire with traditional motion capture systems. We propose a self-supervised approach that learns a single-image 3D pose estimator from unlabeled multi-view data. To this end, we exploit multi-view consistency constraints to disentangle the observed 2D pose into the underlying 3D pose and camera rotation. In contrast to most existing methods, we do not require calibrated cameras and can therefore learn from moving cameras. Nevertheless, in the case of a static camera setup, we present an optional extension to include constant relative camera rotations over multiple views into our framework. Key to the success are new, unbiased reconstruction objectives that mix information across views and training samples. The proposed approach is evaluated on two benchmark datasets (Human3.6M and MPII-INF-3DHP) and on the in-the-wild SkiPose dataset.
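The cross-view mixing idea mentioned in the abstract can be illustrated with a toy sketch: a canonical 3D pose predicted from one view is combined with the camera rotation predicted for another view, projected to 2D, and compared with the 2D pose observed in that other view. The orthographic projection, the squared-error loss, the y-axis rotations, and every function name below are assumptions made for illustration; none of them is taken from the paper.

```python
import math

def rot_y(theta):
    """Rotation about the y-axis (hypothetical stand-in for a predicted camera rotation)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rotate(R, pose):
    """Apply a 3x3 rotation R to every 3D joint in pose."""
    return [tuple(sum(R[i][k] * p[k] for k in range(3)) for i in range(3)) for p in pose]

def project(pose):
    """Orthographic projection (an assumption): drop the depth coordinate."""
    return [(x, y) for x, y, _ in pose]

def mixed_reprojection_loss(canon_poses, rotations, poses_2d):
    """Mean squared 2D error over all (pose from view i, rotation of view j) pairs."""
    total, count = 0.0, 0
    for X in canon_poses:                          # canonical pose predicted from view i
        for R, x2d in zip(rotations, poses_2d):    # rotation and 2D observation of view j
            for (u, v), (u0, v0) in zip(project(rotate(R, X)), x2d):
                total += (u - u0) ** 2 + (v - v0) ** 2
                count += 1
    return total / count

# Toy data: one 3-joint pose seen from two views.
X = [(0.0, 0.0, 0.0), (0.2, 0.5, 0.1), (-0.3, 1.0, 0.4)]
Rs = [rot_y(0.3), rot_y(-0.8)]
obs = [project(rotate(R, X)) for R in Rs]

# Both views agree on the canonical pose: every mixed pair reprojects correctly.
consistent = mixed_reprojection_loss([X, X], Rs, obs)

# One view predicts a shifted pose: the mixed pairs expose the disagreement.
X_bad = [(x + 0.1, y, z) for x, y, z in X]
inconsistent = mixed_reprojection_loss([X, X_bad], Rs, obs)
```

In this toy setting the loss vanishes only when the per-view pose predictions agree in the canonical frame, which is the sense in which mixing poses and rotations across views acts as a multi-view consistency constraint without requiring calibrated cameras.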


