Cooperative Multi-Agent Bandits with Heavy Tails

Paper Title

Cooperative Multi-Agent Bandits with Heavy Tails

Authors

Abhimanyu Dubey, Alex Pentland

Abstract

We study the heavy-tailed stochastic bandit problem in the cooperative multi-agent setting, where a group of agents interact with a common bandit problem while communicating on a network with delays. Existing algorithms for the stochastic bandit in this setting utilize confidence intervals arising from an averaging-based communication protocol known as running consensus, which does not lend itself to robust estimation in heavy-tailed settings. We propose MP-UCB, a decentralized multi-agent algorithm for the cooperative stochastic bandit that incorporates robust estimation with a message-passing protocol. We prove optimal regret bounds for MP-UCB for several problem settings, and also demonstrate its superiority to existing methods. Furthermore, we establish the first lower bounds for the cooperative bandit problem, in addition to providing efficient algorithms for robust bandit estimation of location.
