Paper Title
Memory-Constrained No-Regret Learning in Adversarial Bandits
Authors
Abstract
An adversarial bandit problem with memory constraints is studied, in which only the statistics of a subset of arms can be stored. A hierarchical learning policy is developed that requires only a sublinear order of memory space in terms of the number of arms. Its sublinear regret orders with respect to the time horizon are established for both weak regret and shifting regret. This work appears to be the first on memory-constrained bandit problems under the adversarial setting.