Paper Title

Federated Online Sparse Decision Making

Authors

Chi-Hua Wang, Wenjie Li, Guang Cheng, Guang Lin

Abstract

This paper presents a novel federated linear contextual bandit model in which individual clients face different K-armed stochastic bandits with high-dimensional decision contexts and are coupled through common global parameters. By leveraging the sparsity structure of the linear reward, a collaborative algorithm named \texttt{Fedego Lasso} is proposed to cope with the heterogeneity across clients without exchanging local decision context vectors or raw reward data. \texttt{Fedego Lasso} relies on a novel multi-client teamwork-selfish bandit policy design and achieves near-optimal regret in the shared-parameter case with logarithmic communication cost. In addition, a new conceptual tool called federated-egocentric policies is introduced to delineate the exploration-exploitation trade-off. Experiments demonstrate the effectiveness of the proposed algorithms on both synthetic and real-world datasets.
