一种简单而有效的算法，用于异步联合上下文线性匪徒

论文标题

一种简单而有效的算法，用于异步联合上下文线性匪徒

A Simple and Provably Efficient Algorithm for Asynchronous Federated Contextual Linear Bandits

论文作者

He, Jiafan, Wang, Tianhao, Min, Yifei, Gu, Quanquan

论文摘要

我们研究联合的上下文线性土匪，其中$ m $代理相互合作，借助中央服务器解决全球上下文线性匪徒问题。我们考虑了异步设置，所有代理商都独立工作，一个代理和服务器之间的通信不会触发其他代理的通信。我们提出了一种基于乐观原理的简单算法\ texttt {fedlinucb}。我们证明，\ texttt {fedlinucb}的遗憾是由$ \ tilde {o}（d \ sqrt {\ sum_ {m = 1}^m t_m}）$限制的，通信复杂性是$ \ tilde {o}（o}（dm^2）$，$ d是$ d $ tectement $ tectemential的$ dectement $ tectemention $ d $ tectemention $ tectemention $ tectemention $ tectemential $ tectemential $ textect and $。 $ m $ th代理商的环境。据我们所知，这是第一个可证明有效的算法，它允许联合上下文线性匪徒完全异步通信，同时实现了与单一代理设置相同的遗憾保证。

We study federated contextual linear bandits, where $M$ agents cooperate with each other to solve a global contextual linear bandit problem with the help of a central server. We consider the asynchronous setting, where all agents work independently and the communication between one agent and the server will not trigger other agents' communication. We propose a simple algorithm named \texttt{FedLinUCB} based on the principle of optimism. We prove that the regret of \texttt{FedLinUCB} is bounded by $\tilde{O}(d\sqrt{\sum_{m=1}^M T_m})$ and the communication complexity is $\tilde{O}(dM^2)$, where $d$ is the dimension of the contextual vector and $T_m$ is the total number of interactions with the environment by $m$-th agent. To the best of our knowledge, this is the first provably efficient algorithm that allows fully asynchronous communication for federated contextual linear bandits, while achieving the same regret guarantee as in the single-agent setting.

下载PDF全文

下载文献需遵守相关版权规定

论文标题