Dynamically writing coupled memories using a reinforcement learning agent, meeting physical bounds

Paper title

Dynamically writing coupled memories using a reinforcement learning agent, meeting physical bounds

Authors

Théo Jules, Laura Michel, Adèle Douin, Frédéric Lechenault

Abstract

Traditional memory writing operations proceed one bit at a time, where e.g. an individual magnetic domain is force-flipped by a localized external field. One way to increase material storage capacity would be to write several bits at a time in the bulk of the material. However, the manipulation of bits is commonly done through quasi-static operations. While simple to model, this method is known to reduce memory capacity. In this paper, we demonstrate how a reinforcement learning agent can exploit the dynamical response of a simple multi-bit mechanical system to restore its memory to full capacity. To do so, we introduce a model framework consisting of a chain of bi-stable springs, which is manipulated on one end by the external action of the agent. We show that the agent manages to learn how to reach all available states for three springs, even though some states are not reachable through adiabatic manipulation, and that both the training speed and convergence within physical parameter space are improved using transfer learning techniques. Interestingly, the agent also points to an optimal design of the system in terms of writing time. In fact, it appears to learn how to take advantage of the underlying physics: the control time exhibits a non-monotonic dependence on the internal dissipation, reaching a minimum at a cross-over shown to verify a mechanically motivated scaling relation.
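
To make the setup in the abstract concrete, here is a minimal Python sketch of a chain of three bi-stable springs that is fixed at one end and loaded at the other. Everything specific in it is an assumption made for illustration and is not taken from the paper: the quartic double-well potential, the dashpot placed in parallel with each spring as a stand-in for internal dissipation, the unit masses, the numerical values, and the names `Chain`, `simulate`, and `rest_positions`. The external load is a plain function of time rather than a learned policy.

```python
from dataclasses import dataclass


@dataclass
class Chain:
    """Hypothetical chain of n bi-stable springs, wall at x = 0, load on the last mass."""
    n: int = 3
    k: float = 1.0        # stiffness scale of the double well (assumed)
    e0: float = 1.0       # extension of the "0" state (assumed)
    e1: float = 2.0       # extension of the "1" state (assumed)
    mass: float = 1.0
    damping: float = 1.0  # dashpot in parallel with each spring (assumed form)

    def tension(self, e):
        # Derivative of the assumed potential V(e) = k (e - e0)^2 (e - e1)^2.
        return 2.0 * self.k * (e - self.e0) * (e - self.e1) * (2.0 * e - self.e0 - self.e1)

    def differences(self, x):
        # Differences between consecutive entries, the wall contributing 0.
        return [xi - prev for prev, xi in zip([0.0] + list(x[:-1]), x)]

    def bits(self, x):
        # A spring stores 1 when its extension lies beyond the barrier midpoint.
        mid = 0.5 * (self.e0 + self.e1)
        return tuple(int(e > mid) for e in self.differences(x))

    def rest_positions(self, bits):
        # Mass positions for a chain at rest in the requested state.
        x, total = [], 0.0
        for b in bits:
            total += self.e1 if b else self.e0
            x.append(total)
        return x

    def step(self, x, v, force, dt):
        # Semi-implicit Euler: update velocities first, then positions.
        ext, rate = self.differences(x), self.differences(v)
        t = [self.tension(e) + self.damping * r for e, r in zip(ext, rate)]
        t.append(force)  # the external load acts on the last mass
        new_v = [v[i] + dt * (t[i + 1] - t[i]) / self.mass for i in range(self.n)]
        new_x = [x[i] + dt * new_v[i] for i in range(self.n)]
        return new_x, new_v


def simulate(chain, x, v, force_of_time, duration, dt=0.01):
    # Integrate the chain under a prescribed load protocol force_of_time(t).
    for s in range(int(round(duration / dt))):
        x, v = chain.step(x, v, force_of_time(s * dt), dt)
    return x, v
```

In this toy version, a slowly applied pull can only flip the springs together, which is the kind of limitation of quasi-static driving that the abstract mentions. Reaching the remaining states would require a time-dependent `force_of_time`, which is the role played by the agent in the paper.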
