FLAME: Free-form Language-based Motion Synthesis & Editing

Paper title

FLAME: Free-form Language-based Motion Synthesis & Editing

Authors

Jihoon Kim, Jiseob Kim, Sungjoon Choi

Abstract

Text-based motion generation models are drawing a surge of interest for their potential for automating the motion-making process in the game, animation, or robot industries. In this paper, we propose a diffusion-based motion synthesis and editing model named FLAME. Inspired by the recent successes in diffusion models, we integrate diffusion-based generative models into the motion domain. FLAME can generate high-fidelity motions well aligned with the given text. Also, it can edit the parts of the motion, both frame-wise and joint-wise, without any fine-tuning. FLAME involves a new transformer-based architecture we devise to better handle motion data, which is found to be crucial to manage variable-length motions and well attend to free-form text. In experiments, we show that FLAME achieves state-of-the-art generation performances on three text-motion datasets: HumanML3D, BABEL, and KIT. We also demonstrate that editing capability of FLAME can be extended to other tasks such as motion prediction or motion in-betweening, which have been previously covered by dedicated models.
