Paper Title

MoCapAct: A Multi-Task Dataset for Simulated Humanoid Control

Paper Authors

Nolan Wagener, Andrey Kolobov, Felipe Vieira Frujeri, Ricky Loynd, Ching-An Cheng, Matthew Hausknecht

Abstract

Simulated humanoids are an appealing research domain due to their physical capabilities. Nonetheless, they are also challenging to control, as a policy must drive an unstable, discontinuous, and high-dimensional physical system. One widely studied approach is to utilize motion capture (MoCap) data to teach the humanoid agent low-level skills (e.g., standing, walking, and running) that can then be re-used to synthesize high-level behaviors. However, even with MoCap data, controlling simulated humanoids remains very hard, as MoCap data offers only kinematic information. Finding physical control inputs to realize the demonstrated motions requires computationally intensive methods like reinforcement learning. Thus, despite the publicly available MoCap data, its utility has been limited to institutions with large-scale compute. In this work, we dramatically lower the barrier for productive research on this topic by training and releasing high-quality agents that can track over three hours of MoCap data for a simulated humanoid in the dm_control physics-based environment. We release MoCapAct (Motion Capture with Actions), a dataset of these expert agents and their rollouts, which contain proprioceptive observations and actions. We demonstrate the utility of MoCapAct by using it to train a single hierarchical policy capable of tracking the entire MoCap dataset within dm_control and show the learned low-level component can be re-used to efficiently learn downstream high-level tasks. Finally, we use MoCapAct to train an autoregressive GPT model and show that it can control a simulated humanoid to perform natural motion completion given a motion prompt. Videos of the results and links to the code and dataset are available at https://microsoft.github.io/MoCapAct.
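Because each MoCapAct rollout pairs proprioceptive observations with the expert's actions, downstream policies can be trained by straightforward behavioral cloning on those pairs. The following is a minimal sketch of that idea on synthetic stand-in data; the array shapes, the linear policy, and all variable names are illustrative assumptions, not the released dataset's API.

```python
import numpy as np

rng = np.random.default_rng(0)
obs_dim, act_dim, n = 8, 3, 256

# Synthetic stand-in for (observation, action) pairs from expert rollouts.
W_expert = rng.normal(size=(obs_dim, act_dim))
obs = rng.normal(size=(n, obs_dim))
act = obs @ W_expert + 0.01 * rng.normal(size=(n, act_dim))

# Clone the expert with a linear policy via gradient descent on the
# mean-squared error between predicted and demonstrated actions.
W = np.zeros((obs_dim, act_dim))
lr = 0.05
for _ in range(500):
    err = obs @ W - act
    W -= lr * (obs.T @ err) / n  # gradient of the mean-squared error

final_loss = float(np.mean((obs @ W - act) ** 2))
print(final_loss)  # small residual, limited by the injected action noise
```

In practice the paper's policies are neural networks rather than linear maps, but the supervised objective, regressing actions from observations over the rollout dataset, is the same.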
