Paper Title


Adaptive Discretization against an Adversary: Lipschitz bandits, Dynamic Pricing, and Auction Tuning

Paper Authors

Chara Podimata, Aleksandrs Slivkins

Paper Abstract


Lipschitz bandits is a prominent version of multi-armed bandits that studies large, structured action spaces such as the $[0,1]$ interval, where similar actions are guaranteed to have similar rewards. A central theme here is the adaptive discretization of the action space, which gradually ``zooms in'' on the more promising regions thereof. The goal is to take advantage of ``nicer'' problem instances, while retaining near-optimal worst-case performance. While the stochastic version of the problem is well-understood, the general version with adversarial rewards is not. We provide the first algorithm (\emph{Adversarial Zooming}) for adaptive discretization in the adversarial version, and derive instance-dependent regret bounds. In particular, we recover the worst-case optimal regret bound for the adversarial version, and the instance-dependent regret bound for the stochastic version. We apply our algorithm to several fundamental applications -- including dynamic pricing and auction reserve tuning -- all under adversarial reward models. While these domains often violate Lipschitzness, our analysis only requires a weaker version thereof, allowing for meaningful regret bounds without additional smoothness assumptions. Notably, we extend our results to multi-product dynamic pricing with non-smooth reward structures, a setting which does not even satisfy one-sided Lipschitzness.
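The adaptive-discretization idea at the heart of the abstract is easiest to see in the stochastic setting. The sketch below implements the classic stochastic zooming rule (activate a new arm wherever the active arms' confidence balls fail to cover $[0,1]$; then play the arm with the highest optimistic index), not the paper's \emph{Adversarial Zooming} algorithm. It is a minimal illustration only: the function names, the grid-based covering check, and the constants in the confidence radius are assumptions made for the example.

```python
import math
import random

def conf_radius(horizon, n_pulls):
    """Confidence radius of an arm; shrinks as the arm accumulates pulls."""
    return math.sqrt(2.0 * math.log(horizon) / (n_pulls + 1))

def zooming(reward_fn, horizon, grid_step=1e-3):
    """Adaptive discretization of [0,1]: an arm is activated only where
    no active arm's confidence ball covers the point, so the mesh
    automatically refines near high-reward regions."""
    arms = []  # each arm: [location x, pull count, empirical mean]
    for _ in range(horizon):
        # Activation rule: if some point of [0,1] is uncovered,
        # activate it (at most one per round, a simplification).
        x = 0.0
        while x <= 1.0:
            if all(abs(x - a[0]) > conf_radius(horizon, a[1]) for a in arms):
                arms.append([x, 0, 0.0])
                break
            x += grid_step
        # Selection rule: optimistic index = empirical mean + 2 * radius.
        arm = max(arms, key=lambda a: a[2] + 2.0 * conf_radius(horizon, a[1]))
        r = reward_fn(arm[0])            # observe a noisy reward in [0, 1]
        arm[1] += 1
        arm[2] += (r - arm[2]) / arm[1]  # incremental mean update
    return arms

# Usage: a 1-Lipschitz mean-reward curve peaking at x = 0.3,
# observed through Bernoulli noise.
mu = lambda x: 0.9 - abs(x - 0.3)
arms = zooming(lambda x: float(random.random() < mu(x)), horizon=2000)
print(sorted((round(a[0], 3), a[1]) for a in arms))
```

Running the example, the pull counts concentrate on arms near $x = 0.3$, which is the "zooming in" behavior the abstract describes. The adversarial version solved by the paper cannot rely on such per-arm confidence intervals, since an adversary can shift rewards over time; that is precisely the gap the authors' algorithm addresses.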
