Paper title
Ranked Prioritization of Groups in Combinatorial Bandit Allocation
Paper authors
Paper abstract
Preventing poaching through ranger patrols protects endangered wildlife, directly contributing to the UN Sustainable Development Goal 15 of life on land. Combinatorial bandits have been used to allocate limited patrol resources, but existing approaches overlook the fact that each location is home to multiple species in varying proportions, so a patrol benefits each species to differing degrees. When some species are more vulnerable, we ought to offer more protection to these animals; unfortunately, existing combinatorial bandit approaches do not offer a way to prioritize important species. To bridge this gap, (1) we propose a novel combinatorial bandit objective that trades off reward maximization against prioritization over species, which we call ranked prioritization. We show this objective can be expressed as a weighted linear sum of Lipschitz-continuous reward functions. (2) We provide RankedCUCB, an algorithm to select combinatorial actions that optimize our prioritization-based objective, and prove that it achieves asymptotic no-regret. (3) We demonstrate empirically that RankedCUCB leads to up to 38% improvement in outcomes for endangered species using real-world wildlife conservation data. Along with adapting to other challenges such as preventing illegal logging and overfishing, our no-regret algorithm addresses the general combinatorial bandit problem with a weighted linear objective.
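To make the phrase "weighted linear sum of reward functions" concrete, the following is a minimal illustrative sketch and not the paper's actual formulation. Every name in it (`weighted_linear_objective`, `density`, `group_weights`, `tradeoff`, and the clipped-linear `reward_fn`) is an assumption introduced for illustration. It scores a patrol allocation as a blend of total reward and a priority-weighted sum of per-group benefits.

```python
import numpy as np


def weighted_linear_objective(effort, density, group_weights, reward_fn, tradeoff):
    """Score an allocation as a weighted linear sum of per-location rewards.

    All parameters are hypothetical, chosen for illustration only:
      effort        : patrol effort per location, shape (n_locations,)
      density       : density[g, i] = share of group g living at location i
      group_weights : priority weight per group (higher = more important)
      reward_fn     : maps effort to a per-location reward
      tradeoff      : in [0, 1]; 1 means pure reward, 0 means pure prioritization
    """
    per_location = reward_fn(np.asarray(effort, dtype=float))
    total_reward = per_location.sum()
    # Benefit to each group: location rewards weighted by where that group lives.
    group_benefit = np.asarray(density, dtype=float) @ per_location
    prioritized = float(np.dot(group_weights, group_benefit))
    return float(tradeoff * total_reward + (1.0 - tradeoff) * prioritized)


# Assumed toy data: two locations, two groups; group 0 has higher priority
# and lives mostly at location 0.
density = np.array([[0.9, 0.1],
                    [0.2, 0.8]])
group_weights = np.array([0.75, 0.25])


def reward_fn(e):
    # Clipped-linear reward, a simple Lipschitz-continuous stand-in.
    return np.clip(e, 0.0, 1.0)


score_patrol_loc0 = weighted_linear_objective([1.0, 0.0], density, group_weights, reward_fn, 0.5)
score_patrol_loc1 = weighted_linear_objective([0.0, 1.0], density, group_weights, reward_fn, 0.5)
```

In this toy setup both allocations earn the same total reward, but patrolling location 0 scores higher because the higher-priority group is concentrated there. A bandit algorithm would additionally have to learn `reward_fn` from observations rather than assume it is known.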