Paper Title


Enabling Deep Reinforcement Learning on Energy Constrained Devices at the Edge of the Network

Authors

Jernej Hribar, Ivana Dusparic

Abstract


Deep Reinforcement Learning (DRL) solutions are becoming pervasive at the edge of the network as they enable autonomous decision-making in dynamic environments. However, to adapt to an ever-changing environment, a DRL solution implemented on an embedded device has to continue taking occasional exploratory actions even after initial convergence. In other words, the device has to occasionally take random actions and update the value function, i.e., re-train the Artificial Neural Network (ANN), to ensure its performance remains optimal. Unfortunately, embedded devices often lack the processing power and energy required to train an ANN. The energy aspect is particularly challenging when the edge device is powered solely by Energy Harvesting (EH). To overcome this problem, we propose a two-part algorithm in which the DRL process is trained at the sink. The weights of the fully trained underlying ANN are then periodically transferred to the EH-powered embedded device that takes the actions. Using an EH-powered sensor, a real-world measurement dataset, and optimizing for the Age of Information (AoI) metric, we demonstrate that such a DRL solution can operate without any degradation in performance, with only a few ANN updates per day.
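The split described in the abstract can be illustrated with a minimal sketch: a sink-side component does all value-function training, while the EH-powered device only runs inference on periodically received weights, plus occasional exploration. The class names (`SinkTrainer`, `EHDevice`) are hypothetical, and a dependency-free Q-table stands in for the ANN the paper actually trains; this is not the authors' implementation.

```python
import random

class SinkTrainer:
    """Runs at the sink: trains the value function on experience from the device.
    A per-(state, action) Q-table stands in for the paper's ANN to keep the
    sketch dependency-free."""
    def __init__(self, n_actions):
        self.n_actions = n_actions
        self.q = {}          # hypothetical stand-in for ANN weights
        self.alpha = 0.1     # learning rate
        self.gamma = 0.9     # discount factor

    def train_step(self, state, action, reward, next_state):
        # Standard Q-learning update, done at the sink where energy is plentiful.
        best_next = max(self.q.get((next_state, a), 0.0)
                        for a in range(self.n_actions))
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.alpha * (
            reward + self.gamma * best_next - old)

    def export_weights(self):
        # Snapshot periodically transferred to the embedded device.
        return dict(self.q)

class EHDevice:
    """Runs on the EH-powered device: inference only, with occasional
    exploratory (random) actions so the sink keeps receiving fresh experience."""
    def __init__(self, n_actions, epsilon=0.05):
        self.n_actions = n_actions
        self.epsilon = epsilon
        self.weights = {}

    def load_weights(self, weights):
        # Periodic update from the sink; the only "learning" cost on-device.
        self.weights = weights

    def act(self, state):
        if random.random() < self.epsilon:        # occasional exploration
            return random.randrange(self.n_actions)
        return max(range(self.n_actions),         # otherwise act greedily
                   key=lambda a: self.weights.get((state, a), 0.0))
```

In this arrangement the device never back-propagates; it only evaluates the value function and, a few times per day, overwrites its weights with the sink's latest snapshot, which matches the energy budget the abstract describes.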
