Paper title
Autotuning PID control using Actor-Critic Deep Reinforcement Learning
Paper authors
Abstract
This work is an exploratory study of how reinforcement learning can be used to predict optimal PID parameters for a robot designed for apple harvesting. To study this, the Advantage Actor Critic (A2C) algorithm is implemented on a simulated robot arm. The simulation relies primarily on the ROS framework. Experiments tuning one actuator at a time and two actuators at a time both show that the model can predict PID gains that perform better than the set baseline. In addition, the work studies whether the model can predict PID parameters based on where an apple is located. Initial tests show that the model is indeed able to adapt its predictions to apple locations, making it an adaptive controller.
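To make the idea of "PID gains that perform better than the set baseline" concrete, the following is a minimal sketch of how one set of gains could be scored. It is not the paper's setup: the first-order plant, the reward definition, the gain values, and the names `PID` and `episode_reward` are all illustrative assumptions, and no ROS simulation or A2C training is involved. In an actor-critic setting, the actor would propose the gains and a score of this kind could serve as the reward signal.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete PID controller (hypothetical helper, not from the paper)."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float, dt: float) -> float:
        """Return the control output for the current tracking error."""
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def episode_reward(kp, ki, kd, setpoint=1.0, dt=0.01, steps=500):
    """Score one set of gains on an assumed toy plant, dx/dt = -x + u.

    The reward is the negative mean absolute tracking error, so larger is better.
    """
    pid = PID(kp, ki, kd)
    x = 0.0
    total_abs_error = 0.0
    for _ in range(steps):
        error = setpoint - x
        u = pid.step(error, dt)
        x += (-x + u) * dt  # explicit Euler step of the toy plant
        total_abs_error += abs(error)
    return -total_abs_error / steps


# Assumed baseline gains (proportional only) versus a candidate with integral action.
baseline = episode_reward(kp=1.0, ki=0.0, kd=0.0)
candidate = episode_reward(kp=5.0, ki=2.0, kd=0.0)
print(f"baseline reward:  {baseline:.3f}")
print(f"candidate reward: {candidate:.3f}")
```

On this toy plant the proportional-only baseline leaves a steady-state error, while the candidate with integral action removes it and therefore scores higher. A learned tuner would search over such gains instead of having them fixed by hand.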