论文标题

关于顺序激励设计的复杂性

On the Complexity of Sequential Incentive Design

论文作者

Savas, Yagiz, Gupta, Vijay, Topcu, Ufuk

论文摘要

在许多情况下,主体会与代理人动态交互,并提供一系列激励措施,以使代理人的行为与所需的目标保持一致。本文着重于综合激励序列的问题,即即使代理人的内在动机不知道本金,该序列一旦提供,也会引起所需的代理行为。我们将代理商的行为塑造为马尔可夫决策过程,将其内在动机作为奖励功能,属于有限的奖励功能,并将激励措施视为提供给代理商的额外奖励。我们首先表明行为修改问题(BMP),即合成激励序列的问题,该序列是pspace-hard,该序列以最低的总成本诱导了所需的代理行为。此外,我们表明,通过对原理可用的激励序列施加一定的限制,可以获得BMP的两个NP完整变体。我们还为可​​以通过线性编程解决BMP的一组可能的奖励功能提供了足够的条件。最后,我们提出了两种算法,以对BMP的NP完整变体进行全球和局部最佳的解决方案计算。

In many scenarios, a principal dynamically interacts with an agent and offers a sequence of incentives to align the agent's behavior with a desired objective. This paper focuses on the problem of synthesizing an incentive sequence that, once offered, induces the desired agent behavior even when the agent's intrinsic motivation is unknown to the principal. We model the agent's behavior as a Markov decision process, express its intrinsic motivation as a reward function, which belongs to a finite set of possible reward functions, and consider the incentives as additional rewards offered to the agent. We first show that the behavior modification problem (BMP), i.e., the problem of synthesizing an incentive sequence that induces a desired agent behavior at minimum total cost to the principal, is PSPACE-hard. Moreover, we show that by imposing certain restrictions on the incentive sequences available to the principal, one can obtain two NP-complete variants of the BMP. We also provide a sufficient condition on the set of possible reward functions under which the BMP can be solved via linear programming. Finally, we propose two algorithms to compute globally and locally optimal solutions to the NP-complete variants of the BMP.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源