Paper title
Almost Optimal Variance-Constrained Best Arm Identification
Paper authors
Paper abstract
We design and analyze VA-LUCB, a parameter-free algorithm for identifying the best arm in the fixed-confidence setting under the stringent constraint that the variance of the chosen arm is strictly smaller than a given threshold. An upper bound on VA-LUCB's sample complexity is shown to be characterized by a fundamental variance-aware hardness quantity $H_{VA}$. By proving a lower bound, we show that the sample complexity of VA-LUCB is optimal up to a factor logarithmic in $H_{VA}$. Extensive experiments corroborate the dependence of the sample complexity on the various terms in $H_{VA}$. By comparing VA-LUCB's empirical performance to that of a close competitor, RiskAverse-UCB-BAI by David et al. (2018), our experiments suggest that VA-LUCB has the lowest sample complexity for this class of risk-constrained best arm identification problems, especially for the riskiest instances.
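To make the identification target concrete, the following is a minimal sketch of the quantity the problem asks for, assuming the true means and variances of all arms are known. It is not VA-LUCB itself, which must estimate these quantities from samples; the function name `best_feasible_arm` and the numbers in the usage example are hypothetical and chosen only for illustration.

```python
from typing import Optional, Sequence


def best_feasible_arm(
    means: Sequence[float],
    variances: Sequence[float],
    threshold: float,
) -> Optional[int]:
    """Return the index of the highest-mean arm whose variance is
    strictly below `threshold`, or None if no arm is feasible.

    Assumption: true means and variances are known (an oracle view).
    A bandit algorithm would only have sample-based estimates.
    """
    # An arm is feasible only if its variance is strictly below the threshold.
    feasible = [i for i, v in enumerate(variances) if v < threshold]
    if not feasible:
        return None
    # Among the feasible arms, pick the one with the largest mean.
    return max(feasible, key=lambda i: means[i])


if __name__ == "__main__":
    # Hypothetical instance: arm 0 has the highest mean but is too risky.
    means = [0.9, 0.7, 0.5]
    variances = [0.30, 0.10, 0.05]
    print(best_feasible_arm(means, variances, threshold=0.2))  # prints 1
```

In this hypothetical instance, arm 0 has the largest mean but violates the variance constraint, so the target is arm 1. The strict inequality mirrors the constraint stated in the abstract: an arm whose variance equals the threshold is treated as infeasible.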