Off-Policy Deep Reinforcement Learning Algorithms for Handling Various Robotic Manipulator Tasks

Paper Title

Off-Policy Deep Reinforcement Learning Algorithms for Handling Various Robotic Manipulator Tasks

Authors

Rzayev, Altun; Aghaei, Vahid Tavakol

Abstract

Conventional control methods run into obstacles because of system complexity and heavy demands on data density, so modern and more efficient control methods need to be developed. Off-policy, model-free reinforcement learning algorithms help here because they avoid working with complex models. They stand out in terms of speed and accuracy because they reuse past experience to learn optimal policies. In this study, three reinforcement learning algorithms, DDPG, TD3, and SAC, are used to train the Fetch robotic manipulator on four different tasks in the MuJoCo simulation environment. All three algorithms are off-policy and reach their desired targets by optimizing both policy and value functions. The study analyzes the efficiency and speed of these three algorithms in a controlled environment.
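The abstract's point that off-policy algorithms learn from past experience is usually realized with a replay buffer. The sketch below is a minimal illustration of that idea only; the `ReplayBuffer` class, its capacity, and the toy scalar transitions are assumptions made for this example and are not taken from the paper.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past transitions (hypothetical helper, not from the paper)."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest transition once the buffer is full.
        self.storage = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement over everything stored so far,
        # regardless of which earlier policy produced each transition.
        return random.sample(list(self.storage), batch_size)

    def __len__(self):
        return len(self.storage)


buffer = ReplayBuffer(capacity=3)
for step in range(5):
    # Toy scalar transitions standing in for Fetch observations and actions.
    buffer.add(state=step, action=0.0, reward=-1.0, next_state=step + 1, done=False)

batch = buffer.sample(batch_size=2)
```

In DDPG, TD3, and SAC, minibatches drawn this way would feed the critic and actor updates; those updates are omitted here because the abstract does not describe them.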
