Paper Title

Kick-motion Training with DQN in AI Soccer Environment

Paper Authors

Bumgeun Park, Jihui Lee, Taeyoung Kim, Dongsoo Har

Abstract


This paper presents a technique to train a robot to perform kick-motion in AI soccer by using reinforcement learning (RL). In RL, an agent interacts with an environment and learns to choose an action in a given state at each step. When training RL algorithms, a problem called the curse of dimensionality (COD) can occur if the dimension of the state is high and the amount of training data is small. The COD often degrades the performance of RL models. In the situation of the robot kicking the ball, as the ball approaches the robot, the robot chooses an action based on the information obtained from the soccer field. To avoid the COD, the training data, which are experiences in the case of RL, should be collected evenly from all areas of the soccer field over (theoretically infinite) time. In this paper, we attempt to use the relative coordinate system (RCS), instead of the absolute coordinate system (ACS), as the state for training the kick-motion of the robot agent. Using the RCS eliminates the need for the agent to know all the (state) information of the entire soccer field, reduces the dimension of the state the agent needs to know to perform kick-motion, and consequently alleviates the COD. The training based on the RCS is performed with the widely used Deep Q-Network (DQN) and tested in the AI Soccer environment implemented with Webots simulation software.
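The core idea of the abstract, replacing absolute field coordinates with a robot-relative state, can be illustrated with a minimal sketch. The function below is a hypothetical example, not taken from the paper's implementation: it assumes a 2D field and converts the ball's absolute position into a (distance, bearing) pair relative to the robot's pose, so the agent's state no longer depends on where on the field the interaction happens.

```python
import math

def absolute_to_relative(robot_x, robot_y, robot_theta, ball_x, ball_y):
    """Convert the ball's absolute field coordinates into a state
    relative to the robot: (distance, bearing).

    Dropping the robot's absolute pose from the learned state shrinks
    the effective state space, which is how an RCS state can mitigate
    the curse of dimensionality described in the abstract.
    """
    dx = ball_x - robot_x
    dy = ball_y - robot_y
    distance = math.hypot(dx, dy)
    # Bearing of the ball relative to the robot's heading, wrapped to (-pi, pi]
    bearing = math.atan2(dy, dx) - robot_theta
    bearing = math.atan2(math.sin(bearing), math.cos(bearing))
    return distance, bearing
```

With such a state, experiences collected anywhere on the field map to the same relative representation, so training data need not cover every absolute position.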
