论文标题

最佳的消除算法,用于学习最佳手臂

An Optimal Elimination Algorithm for Learning a Best Arm

论文作者

Hassidim, Avinatan, Kupfer, Ron, Singer, Yaron

论文摘要

我们考虑$(ε,δ)$ - PAC的经典问题,学习一个最佳手臂,其目标是置信度$ 1-δ$,其平均值为$ε$ - apptroximation to Multimed Bandit设置中平均值最高的手臂。这个问题是统计和学习理论中最根本的问题之一,但令人惊讶的是,其最糟糕的样本复杂性尚不清楚。在本文中,我们提出了一种新的方法(ε,δ)$ - PAC学习最好的手臂。这种方法导致了一种算法,其样品复杂性将其收敛到\ emph {}} $(ε,δ)$的最佳样品复杂性分别学习$ n $臂的平均值,我们通过有条件的匹配下限来补充这一结果。更具体地:

We consider the classic problem of $(ε,δ)$-PAC learning a best arm where the goal is to identify with confidence $1-δ$ an arm whose mean is an $ε$-approximation to that of the highest mean arm in a multi-armed bandit setting. This problem is one of the most fundamental problems in statistics and learning theory, yet somewhat surprisingly its worst-case sample complexity is not well understood. In this paper, we propose a new approach for $(ε,δ)$-PAC learning a best arm. This approach leads to an algorithm whose sample complexity converges to \emph{exactly} the optimal sample complexity of $(ε,δ)$-learning the mean of $n$ arms separately and we complement this result with a conditional matching lower bound. More specifically:

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源