Paper Title
Concurrent Decentralized Channel Allocation and Access Point Selection using Multi-Armed Bandits in multi BSS WLANs
Paper Authors
Paper Abstract
Enterprise Wireless Local Area Networks (WLANs) consist of multiple Access Points (APs) covering a given area. Finding a suitable network configuration able to maximize the performance of enterprise WLANs is a challenging task given the complex dependencies between APs and stations. Recently, in wireless networking, the use of reinforcement learning techniques has emerged as an effective solution to efficiently explore the impact of different network configurations on system performance, identifying those that provide better performance. In this paper, we study whether Multi-Armed Bandits (MABs) can offer a feasible solution to the decentralized channel allocation and AP selection problems in enterprise WLAN scenarios. To do so, we empower APs and stations with agents that, by implementing the Thompson sampling algorithm, explore and learn which is the best channel to use and which is the best AP to associate with, respectively. Our evaluation is performed over randomly generated scenarios, which enclose different network topologies and traffic loads. The presented results show that the proposed adaptive framework using MABs outperforms the static approach (i.e., always using the initial default configuration, usually random) regardless of the network density and the traffic requirements. Moreover, we show that the use of the proposed framework reduces the performance variability between different scenarios. Results also show that we achieve the same performance (or better) than static strategies with fewer APs for the same number of stations. Finally, special attention is placed on how the agents interact. Even if the agents operate in a completely independent manner, their decisions have interrelated effects, as they take actions over the same set of channel resources.
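To illustrate the learning mechanism named in the abstract, the following is a minimal sketch of Beta-Bernoulli Thompson sampling for a single independent agent (e.g., an AP choosing among channels). This is a generic textbook formulation, not the paper's exact reward model: the arm count, the binary reward, and the `reward_fn` interface are assumptions for illustration only.

```python
import random

def thompson_sampling(n_arms, n_rounds, reward_fn):
    """Beta-Bernoulli Thompson sampling over n_arms actions.

    Each arm keeps success/failure counts defining a Beta posterior.
    Every round, sample one value per arm from its posterior and play
    the arm with the highest sample; update that arm's counts with the
    observed binary reward. Returns the total reward collected.
    """
    successes = [1] * n_arms  # Beta(1, 1) uniform prior
    failures = [1] * n_arms
    total = 0
    for _ in range(n_rounds):
        # Draw one posterior sample per arm.
        samples = [random.betavariate(successes[a], failures[a])
                   for a in range(n_arms)]
        arm = max(range(n_arms), key=lambda a: samples[a])
        reward = reward_fn(arm)  # hypothetical: 1 on success, 0 otherwise
        total += reward
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return total
```

In a multi-agent WLAN setting like the one studied here, each AP or station would run such a loop independently, with the reward derived from its own observed performance (e.g., achieved throughput), which is how the agents' decisions become coupled through the shared channel resources.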