Hybrid Imitation Learning for Real-Time Service Restoration in Resilient Distribution Systems

Paper Title

Hybrid Imitation Learning for Real-Time Service Restoration in Resilient Distribution Systems

Authors

Zhang, Yichen; Qiu, Feng; Hong, Tianqi; Wang, Zhaoyu; Li, Fangxing

Abstract

Self-healing capability is one of the most critical factors for a resilient distribution system, which requires intelligent agents to automatically perform restorative actions online, including network reconfiguration and reactive power dispatch. These agents should be equipped with a predesigned decision policy to meet real-time requirements and handle highly complex $N-k$ scenarios. The disturbance randomness hampers the application of exploration-dominant algorithms like traditional reinforcement learning (RL), and the agent training problem under $N-k$ scenarios has not been thoroughly solved. In this paper, we propose the imitation learning (IL) framework to train such policies, where the agent will interact with an expert to learn its optimal policy, and therefore significantly improve the training efficiency compared with the RL methods. To handle tie-line operations and reactive power dispatch simultaneously, we design a hybrid policy network for such a discrete-continuous hybrid action space. We employ the 33-node system under $N-k$ disturbances to verify the proposed framework.
