Paper Title
Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making
Paper Authors
Paper Abstract
The Markov assumption (MA) is fundamental to the empirical validity of reinforcement learning. In this paper, we propose a novel Forward-Backward Learning procedure to test the MA in sequential decision making. The proposed test does not assume any parametric form for the joint distribution of the observed data, and it plays an important role in identifying the optimal policy in high-order Markov decision processes (MDPs) and partially observable MDPs. We apply our test to both synthetic datasets and a real data example from mobile health studies to illustrate its usefulness.