Paper Title
Migration of self-propelling agent in a turbulent environment with minimal energy consumption
Paper Authors
Paper Abstract
We present a numerical study of training a self-propelling agent to migrate in an unsteady flow environment. Using a reinforcement learning algorithm, we control the agent to exploit the background flow structure so as to minimize its energy consumption. We consider the agent migrating in two types of flow: a simple periodic double-gyre flow as a proof-of-concept example, and complex turbulent Rayleigh-Bénard convection as a paradigm for migration in the convective atmosphere or ocean. The results show that in both flows the smart agent learns to migrate from one position to another while riding the background currents as much as possible to minimize energy consumption, as evidenced by comparison with a naive agent that moves straight from the origin to the destination. In addition, we find that, compared to the double-gyre flow, the flow field in turbulent Rayleigh-Bénard convection exhibits much stronger fluctuations, and the agent in training is more likely to explore different migration strategies; as a result, the training process is harder to converge. Nevertheless, we can still identify an energy-efficient trajectory corresponding to the strategy with the highest reward received by the agent. These results have important implications for many migration problems, such as unmanned aerial vehicles flying in a turbulent convective environment, where planning energy-efficient trajectories is often required.
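The periodic double-gyre mentioned above as the proof-of-concept flow is a standard analytically defined velocity field (two counter-rotating gyres on a 2×1 domain whose dividing line oscillates in time). As a minimal sketch of such a background flow, the snippet below evaluates that standard stream-function form; the parameter values `A`, `EPS`, and `OMEGA` are common illustrative choices, not values taken from the paper.

```python
import numpy as np

# Standard time-periodic double-gyre stream function:
#   psi(x, y, t) = A * sin(pi * f(x, t)) * sin(pi * y)
#   f(x, t) = a(t) * x**2 + b(t) * x,  a = eps*sin(w t),  b = 1 - 2*eps*sin(w t)
# Velocities follow from u = -dpsi/dy, v = dpsi/dx.
# Parameter values here are illustrative, not from the paper.
A, EPS, OMEGA = 0.1, 0.25, 2 * np.pi / 10


def double_gyre_velocity(x, y, t):
    """Return the velocity (u, v) of the periodic double-gyre on [0, 2] x [0, 1]."""
    a = EPS * np.sin(OMEGA * t)
    b = 1 - 2 * EPS * np.sin(OMEGA * t)
    f = a * x**2 + b * x
    dfdx = 2 * a * x + b
    u = -np.pi * A * np.sin(np.pi * f) * np.cos(np.pi * y)
    v = np.pi * A * np.cos(np.pi * f) * np.sin(np.pi * y) * dfdx
    return u, v
```

In a reinforcement-learning setup such as the one described, a field like this would serve as the environment dynamics: at each step the agent's position is advected by `double_gyre_velocity` plus its own propulsion, and the reward penalizes propulsion effort so that trajectories exploiting the gyres score higher than the naive straight-line path.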