Paper title
SafePILCO: a software tool for safe and data-efficient policy synthesis
Paper authors
Paper abstract
SafePILCO is a software tool for safe and data-efficient policy search with reinforcement learning. It extends the well-known PILCO algorithm, originally written in MATLAB, to support safe learning. We provide a Python implementation that leverages existing libraries, which keeps the codebase short and modular and makes it suitable for wider use by the verification, reinforcement learning, and control communities.