Online Bidding Algorithms for Return-on-Spend Constrained Advertisers

Paper Title

Online Bidding Algorithms for Return-on-Spend Constrained Advertisers

Authors

Zhe Feng, Swati Padmanabhan, Di Wang

Abstract

Online advertising has recently grown into a highly competitive and complex multi-billion-dollar industry, with advertisers bidding for ad slots at large scales and high frequencies. This has resulted in a growing need for efficient "auto-bidding" algorithms that determine the bids for incoming queries to maximize advertisers' targets subject to their specified constraints. This work explores efficient online algorithms for a single value-maximizing advertiser under an increasingly popular constraint: Return-on-Spend (RoS). We quantify efficiency in terms of regret relative to the optimal algorithm, which knows all queries a priori. We contribute a simple online algorithm that achieves near-optimal regret in expectation while always respecting the specified RoS constraint when the input sequence of queries are i.i.d. samples from some distribution. We also integrate our results with the previous work of Balseiro, Lu, and Mirrokni [BLM20] to achieve near-optimal regret while respecting both RoS and fixed budget constraints. Our algorithm follows the primal-dual framework and uses online mirror descent (OMD) for the dual updates. However, we need to use a non-canonical setup of OMD, and therefore the classic low-regret guarantee of OMD, which is for the adversarial setting in online learning, no longer holds. Nonetheless, in our case and more generally where low-regret dynamics are applied in algorithm design, the gradients encountered by OMD can be far from adversarial but influenced by our algorithmic choices. We exploit this key insight to show our OMD setup achieves low regret in the realm of our algorithm.
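
The primal-dual idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm as published, only one plausible instance under stated assumptions: a second-price auction, values and competing prices drawn i.i.d. uniformly from [0, 1], an RoS target of 1 (total value should be at least total spend), and a multiplicative dual update, which is the form online mirror descent takes with an entropy-style regularizer. The function name `simulate_ros_bidding` and the parameters `num_rounds`, `step_size`, and `seed` are hypothetical. Unlike the algorithm described in the abstract, this sketch does not guarantee that the RoS constraint holds at every point.

```python
import math
import random


def simulate_ros_bidding(num_rounds=1000, step_size=0.05, seed=0):
    """Toy primal-dual bidder for a single RoS constraint (illustrative only)."""
    rng = random.Random(seed)
    lam = 1.0  # dual variable for the RoS constraint; stays strictly positive
    total_value = 0.0
    total_spend = 0.0

    for _ in range(num_rounds):
        value = rng.random()  # assumed i.i.d. value of the incoming query
        price = rng.random()  # assumed highest competing bid (second price)

        # Primal step: bid the value scaled up by (1 + lam) / lam.
        # A small lam (constraint is slack) leads to more aggressive bids.
        bid = value * (1.0 + lam) / lam
        won = bid >= price

        if won:
            total_value += value
            total_spend += price

        # Dual step: the gradient is the per-round RoS slack, value minus payment.
        gradient = (value - price) if won else 0.0
        # Multiplicative update: lam grows when a round violates the constraint.
        lam *= math.exp(-step_size * gradient)

    return total_value, total_spend, lam
```

Calling `simulate_ros_bidding()` returns the accumulated value, the accumulated spend, and the final dual variable. Comparing the first two numbers for different step sizes gives a rough feel for how the dual variable steers the bids toward the constraint.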
