Paper Title

Formally Verified Solution Methods for Infinite-Horizon Markov Decision Processes

Authors

Maximilian Schäffeler, Mohammad Abdulaziz

Abstract

We formally verify executable algorithms for solving Markov decision processes (MDPs) in the interactive theorem prover Isabelle/HOL. We build on existing formalizations of probability theory to analyze the expected total reward criterion on infinite-horizon problems. Our developments formalize the Bellman equation and give conditions under which optimal policies exist. Based on this analysis, we verify dynamic programming algorithms to solve tabular MDPs. We evaluate the formally verified implementations experimentally on standard problems and show they are practical. Furthermore, we show that, combined with efficient unverified implementations, our system can compete with and even outperform state-of-the-art systems.
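For readers unfamiliar with the dynamic programming approach the abstract refers to, value iteration on a tabular MDP can be sketched as follows. This is an illustrative, unverified Python sketch, not the authors' Isabelle/HOL development. The data layout, the function names `q_value` and `value_iteration`, the discount factor, and the example MDP are all assumptions made here for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding (not from the paper): transitions[state][action] is a
# list of (probability, next_state, reward) triples.
Outcomes = List[Tuple[float, int, float]]
Transitions = Dict[int, Dict[int, Outcomes]]


def q_value(outcomes: Outcomes, values: Dict[int, float], discount: float) -> float:
    """Expected immediate reward plus discounted value of the successor state."""
    return sum(p * (r + discount * values[s2]) for p, s2, r in outcomes)


def value_iteration(
    transitions: Transitions, discount: float = 0.9, tol: float = 1e-10
) -> Tuple[Dict[int, float], Dict[int, int]]:
    """Apply the Bellman optimality update until the largest change is below tol.

    Assumes 0 <= discount < 1 so that the update is a contraction.
    """
    values = {s: 0.0 for s in transitions}
    while True:
        new_values = {
            s: max(q_value(o, values, discount) for o in actions.values())
            for s, actions in transitions.items()
        }
        delta = max(abs(new_values[s] - values[s]) for s in transitions)
        values = new_values
        if delta < tol:
            break
    # Read off a greedy policy from the (approximately) optimal values.
    policy = {
        s: max(actions, key=lambda a: q_value(actions[a], values, discount))
        for s, actions in transitions.items()
    }
    return values, policy


# Made-up two-state example: in state 0, action 0 stays (reward 0) and
# action 1 moves to state 1 (reward 1); state 1 loops forever with reward 2.
EXAMPLE_MDP: Transitions = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 1, 2.0)]},
}

values, policy = value_iteration(EXAMPLE_MDP)
print(values, policy)
```

With a discount factor of 0.9, state 1 is worth 2 / (1 - 0.9) = 20, so moving from state 0 is worth 1 + 0.9 * 20 = 19, and the greedy policy picks action 1 in state 0.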