Paper title
Ergodic control of a heterogeneous population and application to electricity pricing
Paper authors
Paper abstract
We consider a control problem for a heterogeneous population composed of agents able to switch at any time between different options. The controller aims to maximize an average gain per time unit, assuming that the population is of infinite size. This leads to an ergodic control problem for a "mean-field" Markov Decision Process in which the state space is a product of simplices, and the population evolves according to controlled linear dynamics. By exploiting contraction properties of the dynamics in Hilbert's projective metric, we prove that the infinite-dimensional ergodic eigenproblem admits a solution and show that the latter is in general non-unique. This allows us to obtain optimal strategies, and to quantify the gap between steady-state strategies and optimal ones. In particular, we prove in the one-dimensional case that there exist cyclic policies -- alternating between discount and profit-taking stages -- which secure a greater gain than constant-price policies. On the numerical side, we develop a policy iteration algorithm with "on-the-fly" generated transitions, specifically adapted to decomposable models, leading to substantial memory savings. We finally apply our results to realistic instances coming from an electricity pricing problem encountered in retail markets, and numerically observe the emergence of cyclic promotions for sufficient inertia in the customer behavior.
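The contraction property mentioned in the abstract can be illustrated on a toy case. The sketch below is not taken from the paper: the 3-option switching matrix `P` and the two initial distributions are hypothetical values chosen for illustration. It only checks numerically the classical fact (Birkhoff) that a matrix with strictly positive entries strictly contracts Hilbert's projective metric, which is the kind of property the abstract says is exploited for the controlled linear dynamics.

```python
import math

def hilbert_distance(x, y):
    """Hilbert's projective metric between two strictly positive vectors."""
    ratios = [xi / yi for xi, yi in zip(x, y)]
    return math.log(max(ratios) / min(ratios))

def step(P, mu):
    """One step of the linear dynamics: mu_next[j] = sum_i mu[i] * P[i][j]."""
    n = len(P)
    return [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]

# Hypothetical switching matrix between 3 options (assumption, not from the paper).
# Rows sum to 1 and every entry is strictly positive.
P = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.3, 0.6],
]

# Two hypothetical initial population distributions on the simplex.
mu = [0.7, 0.2, 0.1]
nu = [0.1, 0.3, 0.6]

# Track the Hilbert distance between the two trajectories.
distances = [hilbert_distance(mu, nu)]
for _ in range(5):
    mu, nu = step(P, mu), step(P, nu)
    distances.append(hilbert_distance(mu, nu))

for k, d in enumerate(distances):
    print(f"step {k}: Hilbert distance = {d:.4f}")
```

In this toy setting the printed distances decrease at every step, so both trajectories approach the same stationary distribution. The paper's setting is more general (a product of simplices and price-dependent transitions), which this sketch does not attempt to reproduce.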