Federated Online Sparse Decision Making

Paper Title

Federated Online Sparse Decision Making

Authors

Chi-Hua Wang, Wenjie Li, Guang Cheng, Guang Lin

Abstract

This paper presents a novel federated linear contextual bandits model in which individual clients face different K-armed stochastic bandits with high-dimensional decision contexts and are coupled through common global parameters. By leveraging the sparsity structure of the linear reward, a collaborative algorithm named Fedego Lasso is proposed to cope with heterogeneity across clients without exchanging local decision context vectors or raw reward data. Fedego Lasso relies on a novel multi-client teamwork-selfish bandit policy design and achieves near-optimal regrets for shared-parameter cases with logarithmic communication costs. In addition, a new conceptual tool called federated-egocentric policies is introduced to delineate the exploration-exploitation trade-off. Experiments demonstrate the effectiveness of the proposed algorithms on both synthetic and real-world datasets.
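The abstract does not spell out the steps of Fedego Lasso, so the following is only a minimal sketch of the general idea it describes, not the paper's algorithm. It assumes each client fits a Lasso estimate on its own private (context, reward) data, shares only the coefficient vector, and a server averages and hard-thresholds those vectors before clients act greedily. All function names (`lasso_cd`, `aggregate`, `greedy_arm`, `simulate`), the aggregation rule, and the parameter values are hypothetical choices made for illustration.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used in the Lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)*||y - X b||^2 + lam*||b||_1."""
    n, d = len(X), len(X[0])
    beta = [0.0] * d
    resid = list(y)  # residual y - X beta (beta starts at zero)
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(n_iter):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue
            # correlation of column j with the residual that excludes coordinate j
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta

def aggregate(local_betas, threshold):
    """Assumed server step: average client estimates, zero out small entries."""
    m, d = len(local_betas), len(local_betas[0])
    avg = [sum(b[j] for b in local_betas) / m for j in range(d)]
    return [v if abs(v) > threshold else 0.0 for v in avg]

def greedy_arm(contexts, beta):
    """Pick the arm whose context has the largest estimated linear reward."""
    scores = [sum(xj * bj for xj, bj in zip(x, beta)) for x in contexts]
    return max(range(len(contexts)), key=lambda k: scores[k])

def simulate(seed=0, n_clients=4, n_samples=60, d=20, noise=0.1):
    """Synthetic check: clients share a sparse parameter but keep data local."""
    rng = random.Random(seed)
    beta_true = [0.0] * d
    beta_true[0], beta_true[3] = 1.5, -2.0  # only two active coordinates
    local_betas = []
    for _ in range(n_clients):
        X = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n_samples)]
        y = [sum(a * b for a, b in zip(x, beta_true)) + rng.gauss(0, noise) for x in X]
        local_betas.append(lasso_cd(X, y, lam=0.1))  # only this vector leaves the client
    return beta_true, aggregate(local_betas, threshold=0.2)

if __name__ == "__main__":
    beta_true, beta_hat = simulate()
    print([j for j, v in enumerate(beta_hat) if v != 0.0])
```

In this sketch the raw contexts `X` and rewards `y` never leave the client loop; only the `d`-dimensional estimate is passed to `aggregate`, which mirrors the privacy constraint stated in the abstract. The sketch performs a single communication round and does not model the exploration schedule or the logarithmic communication cost that the paper analyzes.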
