Stable Reinforcement Learning for Optimal Frequency Control: A Distributed Averaging-Based Integral Approach

Paper title

Stable Reinforcement Learning for Optimal Frequency Control: A Distributed Averaging-Based Integral Approach

Authors

Jiang, Yan; Cui, Wenqi; Zhang, Baosen; Cortés, Jorge

Abstract

Frequency control plays a pivotal role in reliable power system operations. It is conventionally performed in a hierarchical way that first rapidly stabilizes the frequency deviations and then slowly restores the nominal frequency. However, as the generation mix shifts from synchronous generators to renewable resources, power systems experience larger and faster frequency fluctuations due to the loss of inertia, which adversely impacts frequency stability. This has motivated active research on algorithms that jointly address frequency degradation and economic efficiency on a fast timescale. A notable example is distributed averaging-based integral (DAI) control, which sets controllable power injections directly proportional to the integrals of frequency deviation and economic inefficiency signals. Nevertheless, DAI does not typically consider the transient performance of the system following power disturbances and has been restricted to quadratic operational cost functions. This manuscript aims to leverage nonlinear optimal controllers to simultaneously achieve optimal transient frequency control and find the most economic power dispatch for frequency restoration. To this end, we integrate reinforcement learning (RL) into the classic DAI, which results in RL-DAI. Specifically, we use RL to learn a neural network-based control policy that maps the integral variables of DAI to the controllable power injections and provides optimal transient frequency control, while DAI inherently ensures frequency restoration and optimal economic dispatch. Compared to existing methods, we provide provable guarantees on the stability of the learned controllers and extend the allowable cost functions to a much larger class. Simulations on the 39-bus New England system illustrate our results.
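To make the classic DAI mechanism described in the abstract concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a hypothetical 3-bus network with linearized swing dynamics, quadratic costs C_i(u_i) = 0.5 * a_i * u_i^2, and a DAI law in a form common in the DAI literature: each bus integrates its own frequency deviation plus the disagreement in marginal costs with its neighbors. The function name `simulate_dai` and all parameter values are illustrative assumptions. RL-DAI, as the abstract describes it, would replace the proportional map from integral state to injection with a learned neural network policy; that step is not shown here.

```python
import numpy as np

def simulate_dai(p, a, M, D, L_net, L_comm, T=1.0, dt=0.005, t_end=100.0):
    """Forward-Euler simulation of linearized swing dynamics under a DAI-style law.

    p      : constant power disturbance at each bus (assumed step change)
    a      : quadratic cost coefficients, C_i(u_i) = 0.5 * a_i * u_i**2
    M, D   : inertia and damping coefficients per bus
    L_net  : Laplacian of the (linearized) power network
    L_comm : Laplacian of the communication graph used for averaging
    """
    n = len(p)
    theta = np.zeros(n)  # phase-angle deviations
    omega = np.zeros(n)  # frequency deviations
    z = np.zeros(n)      # integral states of the controller

    for _ in range(round(t_end / dt)):
        u = -z                    # injection proportional to the integral state
        marginal = a * u          # marginal cost dC_i/du_i
        d_theta = omega
        d_omega = (-D * omega - L_net @ theta + p + u) / M
        # Integrate frequency deviation and marginal-cost disagreement.
        d_z = (omega / a + L_comm @ marginal) / T
        theta = theta + dt * d_theta
        omega = omega + dt * d_omega
        z = z + dt * d_z

    return omega, -z

# Hypothetical 3-bus example: complete graph with unit line weights.
L = np.array([[2.0, -1.0, -1.0],
              [-1.0, 2.0, -1.0],
              [-1.0, -1.0, 2.0]])
p = np.array([-0.3, 0.0, 0.0])   # sudden load increase at bus 0
a = np.array([1.0, 2.0, 4.0])    # heterogeneous cost coefficients
M = np.ones(3)
D = np.ones(3)

omega_final, u_final = simulate_dai(p, a, M, D, L_net=L, L_comm=L)
print("final frequency deviations:", omega_final)
print("final injections:", u_final)
print("final marginal costs:", a * u_final)
```

At an equilibrium of this model the frequency deviations vanish, the injections sum to the negative of the total disturbance, and the marginal costs a_i * u_i are equal across buses, which is the optimality condition for economic dispatch with quadratic costs.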
