Paper Title
Linear Jamming Bandits: Sample-Efficient Learning for Non-Coherent Digital Jamming
Paper Authors
Paper Abstract
It has been shown (Amuru et al., 2015) that online learning algorithms can be used effectively to select optimal physical-layer parameters for jamming digital modulation schemes without a priori knowledge of the victim's transmission strategy. However, this learning problem involves solving a multi-armed bandit problem with a mixed action space that can grow very large. As a result, convergence to the optimal jamming strategy can be slow, especially when the victim's and jammer's symbols are not perfectly synchronized. In this work, we remedy the sample-efficiency issues by introducing a linear bandit algorithm that accounts for inherent similarities between actions. Further, we propose context features that are well suited to the statistical characteristics of the non-coherent jamming problem and demonstrate significantly improved convergence behavior compared to the prior art. Additionally, we show how prior knowledge about the victim's transmissions can be seamlessly integrated into the learning framework. Finally, we discuss limitations in the asymptotic regime.
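To illustrate why a linear bandit can be more sample-efficient than treating every action as an independent arm, the following is a minimal LinUCB-style sketch. It is not the algorithm or the feature set from the paper: the action features (a one-hot jamming signal type plus a normalized power level), the weight vector `theta_true`, and the Gaussian reward model are all hypothetical and chosen only for illustration. The point is that a single shared parameter vector is updated after every pull, so each observation also informs the estimates for similar actions.

```python
import numpy as np


class LinUCB:
    """Generic linear UCB with one shared parameter vector (illustrative sketch)."""

    def __init__(self, dim, alpha=1.0, reg=1.0):
        self.alpha = alpha            # exploration weight (assumed value)
        self.A = reg * np.eye(dim)    # regularized Gram matrix
        self.b = np.zeros(dim)        # reward-weighted feature sum

    def select(self, features):
        """Pick the action with the largest upper confidence bound."""
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b
        mean = features @ theta
        # Per-action confidence width sqrt(x^T A^{-1} x).
        width = np.sqrt(np.einsum("ij,jk,ik->i", features, A_inv, features))
        return int(np.argmax(mean + self.alpha * width))

    def update(self, x, reward):
        """Rank-one update shared by all actions."""
        self.A += np.outer(x, x)
        self.b += reward * x

    def estimate(self):
        return np.linalg.solve(self.A, self.b)


def build_features(n_types=3, powers=(0.25, 0.5, 0.75, 1.0)):
    """Hypothetical features: one-hot signal type plus normalized power."""
    rows = []
    for t in range(n_types):
        for p in powers:
            x = np.zeros(n_types + 1)
            x[t] = 1.0
            x[-1] = p
            rows.append(x)
    return np.array(rows)


def run_demo(seed=0, rounds=2000, noise_std=0.1):
    """Run the bandit on a synthetic linear reward model (assumed, not from the paper)."""
    rng = np.random.default_rng(seed)
    features = build_features()
    theta_true = np.array([0.1, 0.4, 0.2, 0.5])  # hypothetical true weights
    bandit = LinUCB(dim=features.shape[1])
    counts = np.zeros(len(features), dtype=int)
    for _ in range(rounds):
        a = bandit.select(features)
        reward = features[a] @ theta_true + noise_std * rng.normal()
        bandit.update(features[a], reward)
        counts[a] += 1
    best_true = int(np.argmax(features @ theta_true))
    most_played = int(np.argmax(counts))
    return best_true, most_played, bandit.estimate()


if __name__ == "__main__":
    best_true, most_played, theta_hat = run_demo()
    print("true best action:", best_true)
    print("most played action:", most_played)
    print("estimated weights:", np.round(theta_hat, 3))
```

In this toy setting the learner estimates 4 parameters rather than 12 independent arm means, which is the sense in which exploiting similarity between actions reduces the number of samples needed.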