Paper title
Compositional Construction of Finite MDPs for Continuous-Time Stochastic Systems: A Dissipativity Approach
Paper authors
Paper abstract
This paper provides a compositional scheme based on dissipativity approaches for constructing finite abstractions of continuous-time continuous-space stochastic control systems. The proposed framework exploits the structure of the interconnection topology and employs a notion of stochastic storage functions that describe joint dissipativity-type properties of subsystems and their abstractions. By utilizing these stochastic storage functions, one can establish a relation between continuous-time continuous-space stochastic systems and their finite counterparts while quantifying the probabilistic distance between their output trajectories. Consequently, one can employ the finite system as a suitable substitute for the continuous-time one in the controller design process, with a guaranteed error bound. In this respect, we first leverage dissipativity-type compositional conditions to quantify the distance between the interconnection of continuous-time continuous-space stochastic systems and that of their discrete-time (finite or infinite) abstractions. We then consider a specific class of stochastic affine systems and construct their finite abstractions together with their corresponding stochastic storage functions. The effectiveness of the proposed results is demonstrated by applying them to temperature regulation in a circular network of 100 rooms and compositionally constructing a discrete-time abstraction from its original continuous-time dynamics. The constructed discrete-time abstraction is then used as a substitute to compositionally synthesize policies that keep the temperature of each room in a comfort zone.