Paper Title

Rate-Constrained Remote Contextual Bandits

Authors

Francesco Pase, Deniz Gündüz, Michele Zorzi

Abstract

We consider a rate-constrained contextual multi-armed bandit (RC-CMAB) problem, in which a group of agents solves the same contextual multi-armed bandit (CMAB) problem. However, the contexts are observed by a remotely connected entity, i.e., the decision-maker, which updates the policy to maximize the returned rewards and communicates the arms to be sampled by the agents to a controller over a rate-limited communication channel. This framework can be applied to personalized ad placement, where the content owner observes the website visitors, and hence has the context, but needs to transmit the ads to be shown to a controller that is in charge of placing the marketing content. Consequently, the RC-CMAB problem requires the study of lossy compression schemes for the policy, to be employed whenever the constraint on the channel rate does not allow the uncompressed transmission of the decision-maker's intentions. We characterize the fundamental information-theoretic limits of this problem by letting the number of agents go to infinity, and study the regret that can be achieved, identifying two distinct rate regions that lead to linear and sub-linear regret, respectively. We then analyze the optimal compression scheme achievable in the limit of infinitely many agents, when using the forward and reverse KL divergences as distortion metrics. Based on this, we also propose a practical coding scheme and provide numerical results.
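To make the quantities named in the abstract concrete, the following is a minimal sketch, not the authors' scheme. It assumes a toy problem with two contexts and two arms, takes the mutual information I(S; A) between context and arm as an illustrative measure of the rate a policy needs, and adopts the convention that "forward" KL is KL(target || compressed) and "reverse" KL is KL(compressed || target). All names and numbers (`context_probs`, `target_policy`, `compressed_policy`) are hypothetical.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in bits for discrete distributions over the same arms.

    Assumes q[i] > 0 wherever p[i] > 0 (true for the toy values below).
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def expected_kl(context_probs, policy_p, policy_q):
    """Context-averaged KL between two policies, policy[s][a] = pi(a | s)."""
    return sum(
        ps * kl_divergence(row_p, row_q)
        for ps, row_p, row_q in zip(context_probs, policy_p, policy_q)
    )


def mutual_information(context_probs, policy):
    """I(S; A) in bits: how much the chosen arm reveals about the context."""
    num_arms = len(policy[0])
    # Marginal arm distribution obtained by averaging the policy over contexts.
    marginal = [
        sum(ps * row[a] for ps, row in zip(context_probs, policy))
        for a in range(num_arms)
    ]
    return sum(
        ps * kl_divergence(row, marginal)
        for ps, row in zip(context_probs, policy)
    )


# Hypothetical toy instance: two equally likely contexts, two arms.
context_probs = [0.5, 0.5]
target_policy = [[0.9, 0.1], [0.1, 0.9]]      # what the decision-maker wants
compressed_policy = [[0.7, 0.3], [0.3, 0.7]]  # less context-dependent version

rate_target = mutual_information(context_probs, target_policy)
rate_compressed = mutual_information(context_probs, compressed_policy)

# Assumed convention: forward = KL(target || compressed), reverse = the swap.
forward_kl = expected_kl(context_probs, target_policy, compressed_policy)
reverse_kl = expected_kl(context_probs, compressed_policy, target_policy)

print(f"rate of target policy:     {rate_target:.4f} bits")
print(f"rate of compressed policy: {rate_compressed:.4f} bits")
print(f"forward KL distortion:     {forward_kl:.4f} bits")
print(f"reverse KL distortion:     {reverse_kl:.4f} bits")
```

In this toy setting the compressed policy depends less on the context, so it needs fewer bits per agent, at the cost of a nonzero KL distortion from the target policy. This is the kind of trade-off a lossy policy compression scheme has to navigate.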
