Paper Title

TEMOS: Generating diverse human motions from textual descriptions

Authors

Mathis Petrovich, Michael J. Black, Gül Varol

Abstract

We address the problem of generating diverse 3D human motions from textual descriptions. This challenging task requires joint modeling of both modalities: understanding and extracting useful human-centric information from the text, and then generating plausible and realistic sequences of human poses. In contrast to most previous work, which focuses on generating a single, deterministic motion from a textual description, we design a variational approach that can produce multiple diverse human motions. We propose TEMOS, a text-conditioned generative model leveraging variational autoencoder (VAE) training with human motion data, in combination with a text encoder that produces distribution parameters compatible with the VAE latent space. We show the TEMOS framework can produce both skeleton-based animations as in prior work, as well as more expressive SMPL body motions. We evaluate our approach on the KIT Motion-Language benchmark and, despite being relatively straightforward, demonstrate significant improvements over the state of the art. Code and models are available on our webpage.
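To make the idea in the abstract concrete, below is a minimal, hypothetical sketch of a text-conditioned motion VAE in the spirit described: a motion encoder and a text encoder that both map into the same Gaussian latent space, plus a motion decoder, so that sampling from the text-conditioned distribution yields diverse motions. All module choices, dimensions, and names here are illustrative assumptions, not the authors' exact TEMOS architecture.

```python
# Hypothetical sketch of a TEMOS-style text-conditioned motion VAE.
# Module sizes and layer choices below are assumptions for illustration only.
import torch
import torch.nn as nn


class DistributionEncoder(nn.Module):
    """Encodes a feature sequence into latent Gaussian parameters (mu, logvar)."""

    def __init__(self, feat_dim: int, latent_dim: int = 256):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=feat_dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.to_mu = nn.Linear(feat_dim, latent_dim)
        self.to_logvar = nn.Linear(feat_dim, latent_dim)

    def forward(self, x: torch.Tensor):
        h = self.encoder(x).mean(dim=1)  # pool over the time/token axis
        return self.to_mu(h), self.to_logvar(h)


class MotionDecoder(nn.Module):
    """Decodes a single latent vector into a fixed-length pose sequence."""

    def __init__(self, latent_dim: int = 256, pose_dim: int = 64, num_frames: int = 60):
        super().__init__()
        self.num_frames, self.pose_dim = num_frames, pose_dim
        self.mlp = nn.Sequential(
            nn.Linear(latent_dim, 512), nn.ReLU(),
            nn.Linear(512, num_frames * pose_dim),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.mlp(z).view(z.shape[0], self.num_frames, self.pose_dim)


def reparameterize(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    """Sample z ~ N(mu, sigma^2); sampling at test time gives diverse outputs."""
    return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)


if __name__ == "__main__":
    batch, text_len, motion_len = 2, 16, 60
    text_feat_dim, pose_dim, latent_dim = 128, 64, 256

    # Placeholder inputs: pre-computed text token embeddings and pose sequences.
    text_tokens = torch.randn(batch, text_len, text_feat_dim)
    motions = torch.randn(batch, motion_len, pose_dim)

    text_encoder = DistributionEncoder(text_feat_dim, latent_dim)
    motion_encoder = DistributionEncoder(pose_dim, latent_dim)
    decoder = MotionDecoder(latent_dim, pose_dim, motion_len)

    # Both encoders map into the same latent space; training would align them
    # (e.g. with KL and reconstruction terms) so text alone can drive generation.
    mu_t, logvar_t = text_encoder(text_tokens)
    mu_m, logvar_m = motion_encoder(motions)
    z = reparameterize(mu_t, logvar_t)  # text-conditioned latent sample
    generated = decoder(z)
    print(generated.shape)  # torch.Size([2, 60, 64])
```

Because generation only needs the text branch at test time, drawing several samples of z for the same description produces multiple distinct motions, which is the diversity the abstract emphasizes.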
