Paper title
Multi-step dual control for exploration and exploitation in autonomous search with convergence guarantee
Paper authors
Abstract
Motivated by the recently proposed dual control for exploration and exploitation (DCEE) concept, this paper presents a multi-step DCEE (MS-DCEE) framework with guaranteed convergence for autonomous search of a source of airborne dispersion. Unlike existing stochastic model predictive control (SMPC) algorithms and informative path planning (IPP) approaches, the proposed MS-DCEE approach uses the current and future inputs not only to drive the agent towards the estimated source location (exploitation) but also to reduce the estimation uncertainty (exploration) by actively learning the operational environment. The unknown source position, together with the unknown environment, poses significant challenges in establishing the recursive feasibility and convergence of the proposed algorithm. To address them, we exploit properties of Bayesian estimation and develop a two-step approach: the mean estimate is first assumed to be unbiased, and the randomness of the mean estimate under each collected information sequence is then accounted for. On this basis, we develop an MS-DCEE scheme with suitable terminal ingredients for which recursive feasibility and convergence are guaranteed. Simulations of two scenarios show that the proposed MS-DCEE algorithm outperforms the SMPC, IPP and single-step DCEE approaches in terms of search success rate and efficiency.
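The exploitation and exploration roles described in the abstract can be illustrated with a standard identity: the expected squared distance between the agent and an uncertain source equals the squared distance to the mean estimate plus the trace of the estimate's covariance. The sketch below checks this numerically for a single step. It is not the paper's MS-DCEE scheme: the 2-D setting, the equally weighted particle representation of the belief, and the function names `dual_cost` and `expected_sq_distance` are all assumptions made for illustration, and the multi-step horizon and terminal ingredients are omitted.

```python
import random


def dual_cost(position, particles):
    """Split the expected squared distance to the source into two terms.

    position  : (x, y) of the agent (hypothetical 2-D setting)
    particles : equally weighted (x, y) samples of the source location
                (an assumed particle approximation of the belief)
    """
    n = len(particles)
    mean_x = sum(p[0] for p in particles) / n
    mean_y = sum(p[1] for p in particles) / n
    # Exploitation term: squared distance from the agent to the mean estimate.
    exploitation = (position[0] - mean_x) ** 2 + (position[1] - mean_y) ** 2
    # Exploration term: trace of the particle covariance (dividing by n).
    exploration = sum(
        (p[0] - mean_x) ** 2 + (p[1] - mean_y) ** 2 for p in particles
    ) / n
    return exploitation, exploration


def expected_sq_distance(position, particles):
    """Average squared distance from the agent to every particle."""
    return sum(
        (position[0] - p[0]) ** 2 + (position[1] - p[1]) ** 2 for p in particles
    ) / len(particles)


if __name__ == "__main__":
    rng = random.Random(0)
    # Illustrative belief over the source location; the numbers are arbitrary.
    belief = [(rng.gauss(5.0, 2.0), rng.gauss(-1.0, 0.5)) for _ in range(1000)]
    agent = (0.0, 0.0)
    exploit, explore = dual_cost(agent, belief)
    total = expected_sq_distance(agent, belief)
    # The two terms should add up to the total, up to floating-point error.
    print(abs(total - (exploit + explore)) < 1e-9)
```

Under these assumptions, moving the agent changes only the exploitation term, while the exploration term shrinks only when new measurements narrow the belief. An input that accounts for both must therefore weigh movement towards the estimate against movement that yields informative measurements.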