Paper Title
Retrospective Analysis of the 2019 MineRL Competition on Sample Efficient Reinforcement Learning
Paper Authors
Paper Abstract
To facilitate research on sample-efficient reinforcement learning, we held the MineRL Competition on Sample Efficient Reinforcement Learning Using Human Priors at the Thirty-third Conference on Neural Information Processing Systems (NeurIPS 2019). The primary goal of this competition was to promote the development of algorithms that use human demonstrations alongside reinforcement learning to reduce the number of samples needed to solve complex, hierarchical, and sparse environments. We describe the competition, outlining the primary challenge, the competition design, and the resources that we provided to the participants. We provide an overview of the top solutions, each of which uses deep reinforcement learning and/or imitation learning. We also discuss the impact of our organizational decisions on the competition and future directions for improvement.