Paper Title
Recursive Two-Step Lookahead Expected Payoff for Time-Dependent Bayesian Optimization
Authors
Abstract
We propose a novel Bayesian method to solve the maximization of a time-dependent expensive-to-evaluate oracle. We are interested in the decision that maximizes the oracle at a finite time horizon, when relatively few noisy evaluations can be performed before the horizon. Our recursive, two-step lookahead expected payoff ($\texttt{r2LEY}$) acquisition function makes nonmyopic decisions at every stage by maximizing the estimated expected value of the oracle at the horizon. $\texttt{r2LEY}$ circumvents the evaluation of the expensive multistep (more than two steps) lookahead acquisition function by recursively optimizing a two-step lookahead acquisition function at every stage; unbiased estimators of this latter function and its gradient are utilized for efficient optimization. $\texttt{r2LEY}$ is shown to exhibit natural exploration properties far from the time horizon, enabling accurate emulation of the oracle, which is exploited in the final decision made at the horizon. To demonstrate the utility of $\texttt{r2LEY}$, we compare it with time-dependent extensions of popular myopic acquisition functions via both synthetic and real-world datasets.
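The abstract does not give the form of the two-step lookahead acquisition function, so the following is only a minimal sketch of one plausible reading: choose a point x at the current time t, simulate the noisy observation there, update the surrogate, and score x by the average of the best posterior mean at the horizon T. The Gaussian-process surrogate over (x, t), the RBF kernel, the length scale, the noise level, the candidate grid, and every function name (`rbf`, `posterior`, `two_step_payoff`) are assumptions made for illustration and are not taken from the paper. The sample average is an unbiased Monte Carlo estimate of the expectation over the simulated observation; the gradient estimator mentioned in the abstract is not sketched here.

```python
import numpy as np

def rbf(A, B, length=0.3):
    """Squared-exponential kernel between rows of A and B (columns: x, t)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length**2)

def posterior(Z, y, Zs, noise=1e-2):
    """Zero-mean GP posterior mean and covariance at Zs given noisy data (Z, y)."""
    K = rbf(Z, Z) + noise * np.eye(len(Z))
    Ks = rbf(Z, Zs)
    mean = Ks.T @ np.linalg.solve(K, y)
    cov = rbf(Zs, Zs) - Ks.T @ np.linalg.solve(K, Ks)
    return mean, cov

def two_step_payoff(x, t, Z, y, T, grid, n_samples=64, noise=1e-2, seed=0):
    """Monte Carlo estimate of E_y[ max_x' mean(x', T) ] after observing y at (x, t)."""
    rng = np.random.default_rng(seed)
    z_new = np.array([[x, t]])
    mu, cov = posterior(Z, y, z_new, noise)
    # Predictive standard deviation of a noisy observation at (x, t).
    sd = np.sqrt(max(cov[0, 0], 0.0) + noise)
    # Candidate final decisions, all evaluated at the horizon time T.
    horizon = np.column_stack([grid, np.full_like(grid, T)])
    Z_aug = np.vstack([Z, z_new])
    payoffs = []
    for y_sim in mu[0] + sd * rng.standard_normal(n_samples):
        # Condition on the simulated observation, then take the best horizon mean.
        m_T, _ = posterior(Z_aug, np.append(y, y_sim), horizon, noise)
        payoffs.append(m_T.max())
    return float(np.mean(payoffs))

# Hypothetical toy data: five noisy evaluations taken before the horizon T = 1.
rng = np.random.default_rng(1)
Z = rng.uniform(0.0, 1.0, size=(5, 2))
Z[:, 1] *= 0.5                      # observation times lie in [0, 0.5]
y = np.sin(6.0 * Z[:, 0]) + Z[:, 1]  # made-up oracle values

grid = np.linspace(0.0, 1.0, 25)
candidates = np.linspace(0.0, 1.0, 11)
scores = [two_step_payoff(x, 0.6, Z, y, T=1.0, grid=grid) for x in candidates]
x_next = candidates[int(np.argmax(scores))]
```

Under this reading, the recursive scheme described in the abstract would repeat the same selection at each later stage with the newly observed data appended, until the final decision is made at the horizon.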