Paper Title
Deep Reinforcement Learning for Resource Constrained Multiclass Scheduling in Wireless Networks
Paper Authors
Paper Abstract
The problem of resource constrained scheduling in a dynamic and heterogeneous wireless setting is considered here. In our setup, the available limited bandwidth resources are allocated in order to serve randomly arriving service demands, which in turn belong to different classes in terms of payload data requirement, delay tolerance, and importance/priority. In addition to heterogeneous traffic, another major challenge stems from random service rates due to time-varying wireless communication channels. Various approaches for scheduling and resource allocation can be used, ranging from simple greedy heuristics and constrained optimization to combinatorics. Those methods are tailored to specific network or application configurations and are usually suboptimal. To this end, we resort to deep reinforcement learning (DRL) and propose a distributional Deep Deterministic Policy Gradient (DDPG) algorithm combined with Deep Sets to tackle the aforementioned problem. Furthermore, we present a novel way to use a Dueling Network, which leads to further performance improvement. Our proposed algorithm is tested on both synthetic and real data, showing consistent gains against state-of-the-art conventional methods from combinatorics, optimization, and scheduling metrics.
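To illustrate the Deep Sets component the abstract refers to, a minimal sketch of the underlying idea follows: each element of a variable-size set (e.g. the features of one queued service demand) passes through a shared transform, the results are pooled by summation, and the pooled vector passes through a second transform, yielding an output that is invariant to the ordering of the demands. The weights, dimensions, and feature layout here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Illustrative weights (not from the paper): a shared per-element
# transform (phi) and a post-pooling transform (rho).
rng = np.random.default_rng(0)
W_phi = rng.normal(size=(4, 8))
W_rho = rng.normal(size=(8, 2))

def deep_sets(X):
    """X: (n_demands, 4) array of per-demand features; returns a (2,) vector."""
    h = np.maximum(X @ W_phi, 0.0)  # phi with ReLU, applied to each element
    pooled = h.sum(axis=0)          # sum pooling: invariant to element order
    return pooled @ W_rho           # rho on the pooled representation

X = rng.normal(size=(5, 4))         # five queued demands, 4 features each
perm = rng.permutation(5)
out1 = deep_sets(X)
out2 = deep_sets(X[perm])           # same set of demands, shuffled order
assert np.allclose(out1, out2)      # permutation invariance holds
```

Because the pooling is a sum, the same network handles any number of queued demands, which is what makes this architecture a natural fit for scheduling over a varying set of arrivals.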