Paper Title

Cooperative Internet of UAVs: Distributed Trajectory Design by Multi-agent Deep Reinforcement Learning

Authors

Jingzhi Hu, Hongliang Zhang, Lingyang Song, Robert Schober, H. Vincent Poor

Abstract

Due to the advantages of flexible deployment and extensive coverage, unmanned aerial vehicles (UAVs) have great potential for sensing applications in the next generation of cellular networks, which will give rise to a cellular Internet of UAVs. In this paper, we consider a cellular Internet of UAVs, where the UAVs execute sensing tasks through cooperative sensing and transmission to minimize the age of information (AoI). However, the cooperative sensing and transmission is tightly coupled with the UAVs' trajectories, which makes the trajectory design challenging. To tackle this challenge, we propose a distributed sense-and-send protocol, where the UAVs determine the trajectories by selecting from a discrete set of tasks and a continuous set of locations for sensing and transmission. Based on this protocol, we formulate the trajectory design problem for AoI minimization and propose a compound-action actor-critic (CA2C) algorithm to solve it based on deep reinforcement learning. The CA2C algorithm can learn the optimal policies for actions involving both continuous and discrete variables and is suited for the trajectory design. Our simulation results show that the CA2C algorithm outperforms four baseline algorithms. Also, we show that by dividing the tasks, cooperative UAVs can achieve a lower AoI compared to non-cooperative UAVs.
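As a rough illustration of the age of information (AoI) metric that the abstract optimizes, the sketch below tracks AoI for one sensing task under an assumed slotted-time model. The abstract does not specify the time model, so the slot structure, the function name `aoi_trace`, and its parameters are hypothetical and are not taken from the paper. The only definition relied on is the standard one: AoI is the time elapsed since the freshest delivered sample was generated.

```python
def aoi_trace(deliveries, horizon, initial_age=0):
    """Return the AoI at the end of each slot 0..horizon-1.

    deliveries: dict mapping a delivery slot to the slot in which the
                delivered sample was generated (hypothetical input format).
    horizon:    number of slots to simulate.
    initial_age: AoI before slot 0 (an assumption of this sketch).
    """
    age = initial_age
    trace = []
    for t in range(horizon):
        # Without a delivery, the information simply grows one slot older.
        aged = age + 1
        if t in deliveries:
            # A delivered sample sets the age to the time since its generation,
            # but only if that is fresher than what the receiver already holds.
            age = min(aged, t - deliveries[t])
        else:
            age = aged
        trace.append(age)
    return trace


# Samples generated in slots 1 and 3 are delivered in slots 2 and 5.
trace = aoi_trace({2: 1, 5: 3}, horizon=6)
print(trace)                    # [1, 2, 1, 2, 3, 2]
print(sum(trace) / len(trace))  # time-averaged AoI over the horizon
```

Under this model, a trajectory that shortens the gap between sensing (generation) and transmission (delivery) lowers the trace, which is the quantity a trajectory design would aim to keep small.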
