Title
Uncertainty-aware Contact-safe Model-based Reinforcement Learning
Authors
Abstract
This letter presents contact-safe Model-based Reinforcement Learning (MBRL) for robot applications that achieves contact-safe behaviors during the learning process. In typical MBRL, the data-driven model cannot be expected to generate accurate and reliable policies for the intended robotic tasks during learning due to sample scarcity. Operating such unreliable policies in a contact-rich environment could damage the robot and its surroundings. To alleviate the risk of damage from unexpected, intensive physical contacts, we present a contact-safe MBRL that associates the probabilistic Model Predictive Control's (pMPC) control limits with the model uncertainty, so that the allowed acceleration of the controlled behavior is adjusted according to learning progress. Control planning with such uncertainty-aware control limits is formulated as a deterministic MPC problem using computationally efficient approximate GP dynamics and an approximate inference technique. Our approach's effectiveness is evaluated through bowl-mixing tasks with simulated and real robots, and scooping tasks with a real robot, as examples of contact-rich manipulation skills. (video: https://youtu.be/sdhHP3NhYi0)
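The core mechanism the abstract describes — tightening the allowed control magnitude where the learned GP dynamics model is uncertain, and relaxing it as data accumulates — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the GP here is a plain RBF-kernel regressor, and `uncertainty_aware_limit` (an assumed example rule, not from the paper) maps the GP's predictive standard deviation to a control limit via an exponential decay.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix between row-vector inputs A and B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gp_predictive_std(X_train, X_query, noise=1e-4):
    """GP predictive standard deviation at X_query given training inputs.
    (The posterior variance depends only on the inputs, not the targets.)"""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    k_star = rbf_kernel(X_train, X_query)             # cross-covariances
    v = np.linalg.solve(K, k_star)
    var = rbf_kernel(X_query, X_query) - k_star.T @ v  # posterior covariance
    return np.sqrt(np.maximum(np.diag(var), 0.0))

def uncertainty_aware_limit(sigma, u_max_nominal=2.0, beta=5.0):
    """Shrink the allowed control magnitude as model uncertainty grows.
    Illustrative mapping; the paper's exact limit rule may differ."""
    return u_max_nominal * np.exp(-beta * sigma)

# States visited so far: uncertainty is low near them, high far away,
# so the control limit relaxes only where the model has been trained.
X_visited = np.linspace(-1.0, 1.0, 20)[:, None]
sigma = gp_predictive_std(X_visited, np.array([[0.0], [5.0]]))
limits = uncertainty_aware_limit(sigma)  # near-data limit > far-from-data limit
```

As more transitions are collected, the predictive standard deviation shrinks over a wider region of the state space, and the limit approaches its nominal value — mirroring the abstract's "adjusted according to learning progress."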