Multi-armed Bandits for Link Configuration in Millimeter-wave Networks

Paper Title

Multi-armed Bandits for Link Configuration in Millimeter-wave Networks

Authors

Zhang, Yi; Heath Jr., Robert W.

Abstract

Establishing and maintaining millimeter-wave (mmWave) links is challenging because the environment changes and mmWave signals are highly sensitive to user mobility and channel conditions. MmWave link configuration problems often involve searching, under environmental uncertainty, for the optimal system parameters within a finite set of alternatives supported by the system hardware and protocol. For example, beam sweeping aims to identify the optimal beam(s) for data transmission from a discrete codebook. Selecting parameters such as the beam sweeping period and the beamwidth is crucial to achieving high overall system throughput. In this article, we motivate the use of the multi-armed bandit (MAB) framework to intelligently search for the optimal configuration when establishing mmWave links. MAB is a reinforcement learning framework that guides a decision-maker to sequentially select one action from a set of actions. As an example, we show that within the MAB framework, the optimal beam sweeping period, beamwidth, and beam directions can be learned dynamically with sample- and computation-efficient bandit algorithms. We conclude by highlighting some future research directions for enhancing mmWave link configuration design with MAB.
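To make the sequential-selection idea in the abstract concrete, the following is a minimal sketch of one standard bandit algorithm, UCB1, applied to choosing a beam from a discrete codebook. The abstract does not say which bandit algorithms the article uses, so UCB1 is substituted here purely for illustration. The function name ucb1_beam_selection, the Bernoulli reward model (1 for a successful transmission, 0 otherwise), and the per-beam success probabilities are all hypothetical assumptions, not taken from the article.

```python
import math
import random


def ucb1_beam_selection(mean_rewards, horizon, seed=0):
    """Run UCB1 over a discrete beam codebook; return per-beam pull counts.

    mean_rewards[i] is an assumed success probability for beam i.
    It is hidden from the learner and used only to simulate feedback.
    """
    rng = random.Random(seed)
    n_beams = len(mean_rewards)
    counts = [0] * n_beams   # number of times each beam was tried
    sums = [0.0] * n_beams   # total reward collected by each beam

    for t in range(1, horizon + 1):
        if t <= n_beams:
            # Initialization: try every beam once.
            beam = t - 1
        else:
            # Choose the beam with the largest upper confidence bound:
            # empirical mean plus an exploration bonus that shrinks
            # as the beam is tried more often.
            beam = max(
                range(n_beams),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        # Simulated Bernoulli feedback (assumption, not a channel model).
        reward = 1.0 if rng.random() < mean_rewards[beam] else 0.0
        counts[beam] += 1
        sums[beam] += reward
    return counts


if __name__ == "__main__":
    # Four hypothetical beams; the third has the highest success rate.
    print(ucb1_beam_selection([0.2, 0.5, 0.9, 0.4], horizon=2000))
```

In this toy setting the learner concentrates its trials on the beam with the highest assumed success probability while still sampling the others occasionally. The same loop could in principle treat candidate beam sweeping periods or beamwidths as the arms, provided a suitable reward such as measured throughput is defined.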
