Paper Title

Snipper: A Spatiotemporal Transformer for Simultaneous Multi-Person 3D Pose Estimation Tracking and Forecasting on a Video Snippet

Authors

Shihao Zou, Yuanlu Xu, Chao Li, Lingni Ma, Li Cheng, Minh Vo

Abstract

Multi-person pose understanding from RGB videos involves three complex tasks: pose estimation, tracking, and motion forecasting. Intuitively, accurate multi-person pose estimation facilitates robust tracking, and robust tracking builds the crucial history for correct motion forecasting. Most existing works either focus on a single task or employ multi-stage approaches to solve the tasks separately, which tends to make sub-optimal decisions at each stage and also fails to exploit correlations among the three tasks. In this paper, we propose Snipper, a unified framework to perform multi-person 3D pose estimation, tracking, and motion forecasting simultaneously in a single stage. We propose an efficient yet powerful deformable attention mechanism to aggregate spatiotemporal information from the video snippet. Building upon this deformable attention, a video transformer is learned to encode the spatiotemporal features from the multi-frame snippet and to decode informative pose features for multi-person pose queries. Finally, these pose queries are regressed to predict multi-person pose trajectories and future motions in a single shot. In the experiments, we show the effectiveness of Snipper on three challenging public datasets, where our generic model rivals specialized state-of-the-art baselines for pose estimation, tracking, and forecasting.
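
To make the pipeline described in the abstract concrete, below is a minimal PyTorch sketch of a Snipper-style single-stage model: spatiotemporal features from a video snippet are encoded by a transformer, a set of learned pose queries is decoded against them, and each query regresses a 3D pose trajectory (observed plus future frames) and a presence score. Standard multi-head attention stands in for the paper's deformable attention, and all module names, shapes, and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a Snipper-style single-stage pipeline (illustrative only).
# nn.MultiheadAttention inside the standard transformer layers stands in for the
# paper's deformable attention; all names, shapes, and hyperparameters are assumptions.
import torch
import torch.nn as nn


class SnippetPoseTransformer(nn.Module):
    def __init__(self, feat_dim=256, num_queries=20, num_joints=15,
                 num_frames=4, num_future=2, num_layers=4, num_heads=8):
        super().__init__()
        self.num_frames, self.num_future = num_frames, num_future
        self.num_joints = num_joints
        # Encoder over flattened (T*H*W) spatiotemporal tokens from a CNN backbone.
        enc_layer = nn.TransformerEncoderLayer(feat_dim, num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers)
        dec_layer = nn.TransformerDecoderLayer(feat_dim, num_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers)
        # One learned query per (person slot, frame); a person slot keeps the same
        # index across observed and future frames, so identity is tied to the query.
        self.pose_queries = nn.Embedding(num_queries * (num_frames + num_future), feat_dim)
        # Heads: 3D joints per query plus a per-query presence score.
        self.joint_head = nn.Linear(feat_dim, num_joints * 3)
        self.score_head = nn.Linear(feat_dim, 1)

    def forward(self, snippet_feats):
        # snippet_feats: (B, T*H*W, C) flattened spatiotemporal tokens.
        memory = self.encoder(snippet_feats)
        B = snippet_feats.shape[0]
        queries = self.pose_queries.weight.unsqueeze(0).expand(B, -1, -1)
        decoded = self.decoder(queries, memory)           # (B, Q*(T+F), C)
        joints = self.joint_head(decoded)                 # 3D joints per query
        scores = self.score_head(decoded).sigmoid()       # presence per query
        T = self.num_frames + self.num_future
        joints = joints.view(B, -1, T, self.num_joints, 3)  # (B, persons, frames, J, 3)
        scores = scores.view(B, -1, T)
        return joints, scores


# Toy usage: 2 snippets, 4 observed frames of 16x16 feature maps with 256 channels.
feats = torch.randn(2, 4 * 16 * 16, 256)
model = SnippetPoseTransformer()
poses, scores = model(feats)
print(poses.shape)   # torch.Size([2, 20, 6, 15, 3])
```

In this sketch, because each person slot owns one query index across all observed and future frames, tracking identities over the snippet and forecasting future poses are read directly off the query dimension rather than handled by a separate association stage, which mirrors the single-stage design the abstract describes.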
