Concurrent Training of a Control Policy and a State Estimator for Dynamic and Robust Legged Locomotion

Paper title

Concurrent Training of a Control Policy and a State Estimator for Dynamic and Robust Legged Locomotion

Authors

Ji, Gwanghyeon; Mun, Juhyeok; Kim, Hyeongjun; Hwangbo, Jemin

Abstract

In this paper, we propose a locomotion training framework where a control policy and a state estimator are trained concurrently. The framework consists of a policy network, which outputs the desired joint positions, and a state estimation network, which outputs estimates of the robot's states such as the base linear velocity, foot height, and contact probability. We exploit a fast simulation environment to train the networks, and the trained networks are transferred to the real robot. The trained policy and state estimator are capable of traversing diverse terrains such as a hill, a slippery plate, and a bumpy road. We also demonstrate that the learned policy can run at up to 3.75 m/s on normal flat ground and 3.54 m/s on a slippery plate with a coefficient of friction of 0.22.
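
The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no architecture, sizes, losses, or reinforcement learning algorithm, so everything below is an assumption made for illustration. Single linear layers stand in for the two networks, the dimensions (`N_OBS`, `N_JOINTS`, `N_LEGS`) are hypothetical, and the estimator is assumed to be trained by supervised regression against ground-truth states that a simulator can provide (squared error for velocity and foot height, cross-entropy for contact). The policy's own update is left out.

```python
import math
import random

random.seed(0)

# Hypothetical sizes chosen for illustration; none come from the paper.
N_OBS, N_JOINTS, N_LEGS = 8, 12, 4
N_EST = 3 + N_LEGS + N_LEGS  # base velocity (3) + foot heights + contact logits


def linear(n_in, n_out):
    """Small random weight matrix standing in for a neural network."""
    return [[random.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def forward(weights, x):
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


estimator = linear(N_OBS, N_EST)             # state estimation network
policy = linear(N_OBS + N_EST, N_JOINTS)     # policy network


def estimate(obs):
    """Estimated base velocity, foot heights, and contact probabilities."""
    raw = forward(estimator, obs)
    return raw[:3 + N_LEGS] + [sigmoid(z) for z in raw[3 + N_LEGS:]]


def act(obs):
    """Desired joint positions from the observation plus the state estimate."""
    return forward(policy, obs + estimate(obs))


def estimator_step(obs, target, lr=0.05):
    """One supervised gradient step; returns the loss before the update."""
    raw = forward(estimator, obs)
    loss = 0.0
    for k in range(N_EST):
        if k < 3 + N_LEGS:
            # Regression outputs: squared error, gradient w.r.t. output is err.
            err = raw[k] - target[k]
            loss += 0.5 * err * err
        else:
            # Contact outputs: cross-entropy on the logit, gradient is p - y.
            p, y = sigmoid(raw[k]), target[k]
            loss -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
            err = p - y
        for j in range(N_OBS):
            estimator[k][j] -= lr * err * obs[j]
    return loss


# One concurrent iteration with placeholder data (not real robot or sim data).
obs = [random.uniform(-1.0, 1.0) for _ in range(N_OBS)]
sim_state = [0.0] * (3 + N_LEGS) + [1.0, 0.0, 1.0, 0.0]  # placeholder ground truth
joint_targets = act(obs)                 # the policy acts on the current estimate
est_loss = estimator_step(obs, sim_state)  # the estimator is updated in the same iteration
# A policy update (for example a policy-gradient step) would go here.
```

In this sketch "concurrent" means only that both networks are used and updated in the same training iteration, so the policy always sees the estimator's current output rather than privileged simulator state.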

Paper title