Paper Title
Hybrid Reinforcement Learning for STAR-RISs: A Coupled Phase-Shift Model Based Beamformer
Paper Authors
Paper Abstract
A simultaneous transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) assisted multi-user downlink multiple-input single-output (MISO) communication system is investigated. In contrast to the existing ideal STAR-RIS model assuming independent transmission and reflection phase-shift control, a practical coupled phase-shift model is considered. Then, a joint active and passive beamforming optimization problem is formulated for minimizing the long-term transmission power consumption, subject to the coupled phase-shift constraint and the minimum data rate constraint. Despite the coupled nature of the phase-shift model, the formulated problem is solved by invoking a hybrid continuous and discrete phase-shift control policy. Inspired by this observation, a pair of hybrid reinforcement learning (RL) algorithms, namely the hybrid deep deterministic policy gradient (hybrid DDPG) algorithm and the joint DDPG & deep Q-network (DDPG-DQN) based algorithm, are proposed. The hybrid DDPG algorithm controls the associated high-dimensional continuous and discrete actions by relying on the hybrid action mapping. By contrast, the joint DDPG-DQN algorithm constructs two Markov decision processes (MDPs) relying on an inner and an outer environment, thereby amalgamating the two agents to accomplish a joint hybrid control. Simulation results demonstrate that the STAR-RIS is superior to conventional RISs in terms of energy consumption. Furthermore, both proposed algorithms outperform the baseline DDPG algorithm, and the joint DDPG-DQN algorithm achieves a superior performance, albeit at an increased computational complexity.
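
To make the "hybrid action mapping" idea concrete, the following is a minimal illustrative sketch, not the paper's actual implementation: a single continuous actor output vector is split into a part used directly as continuous actions (e.g., phase shifts) and a part quantized onto a small set of discrete choices. The function name hybrid_action_map, the [-1, 1] output range, and the nearest-level quantization rule are assumptions made for illustration only.

import numpy as np

def hybrid_action_map(actor_output, num_elements, num_discrete_levels):
    """Map one raw continuous actor output onto a (continuous, discrete) action pair.

    actor_output: 1-D array in [-1, 1] of length 2 * num_elements
                  (first half -> continuous sub-action, second half -> discrete sub-action).
    """
    raw_cont = actor_output[:num_elements]
    raw_disc = actor_output[num_elements:]

    # Continuous sub-action: rescale [-1, 1] to a phase shift in [0, 2*pi).
    continuous_action = (raw_cont + 1.0) * np.pi

    # Discrete sub-action: quantize each entry of [-1, 1] onto the nearest of
    # num_discrete_levels evenly spaced levels (e.g., candidate phase offsets).
    bins = np.linspace(-1.0, 1.0, num_discrete_levels)
    discrete_action = np.argmin(np.abs(raw_disc[:, None] - bins[None, :]), axis=1)

    return continuous_action, discrete_action

# Usage example: 4 STAR-RIS elements, 2 discrete levels per element.
rng = np.random.default_rng(0)
out = rng.uniform(-1.0, 1.0, size=8)
cont, disc = hybrid_action_map(out, num_elements=4, num_discrete_levels=2)
print(cont, disc)

The point of such a mapping is that a standard continuous-action learner (here, a DDPG-style actor) can still produce the discrete decisions required by the coupled phase-shift constraint; the joint DDPG-DQN alternative described in the abstract instead assigns the discrete decisions to a separate DQN agent operating on its own MDP.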