Quantum policy gradient algorithms

Paper title

Quantum policy gradient algorithms

Authors

Sofiene Jerbi, Arjan Cornelissen, Māris Ozols, Vedran Dunjko

Abstract

Understanding the power and limitations of quantum access to data in machine learning tasks is primordial to assess the potential of quantum computing in artificial intelligence. Previous works have already shown that speed-ups in learning are possible when given quantum access to reinforcement learning environments. Yet, the applicability of quantum algorithms in this setting remains very limited, notably in environments with large state and action spaces. In this work, we design quantum algorithms to train state-of-the-art reinforcement learning policies by exploiting quantum interactions with an environment. However, these algorithms only offer full quadratic speed-ups in sample complexity over their classical analogs when the trained policies satisfy some regularity conditions. Interestingly, we find that reinforcement learning policies derived from parametrized quantum circuits are well-behaved with respect to these conditions, which showcases the benefit of a fully-quantum reinforcement learning framework.
