Paper Title

A Deep Reinforcement Learning Approach for Dynamic Contents Caching in HetNets

Authors

Ma, Manyou, Wong, Vincent W. S.

Abstract


The recent development of the Internet of Things necessitates caching of dynamic contents, where new versions of contents become available around-the-clock and timely updates are required to ensure their relevance. The age of information (AoI) is a performance metric that evaluates the freshness of contents. Existing works on AoI optimization of cache content update algorithms focus on minimizing the long-term average AoI of all cached contents. Sometimes user requests that need to be served in the future are known in advance and can be stored in user request queues. In this paper, we propose dynamic cache content update scheduling algorithms that exploit the user request queues. We consider a special use case where the trained neural networks (NNs) from deep learning models are being cached in a heterogeneous network. A queue-aware cache content update scheduling algorithm based on a Markov decision process (MDP) is developed to minimize the average AoI of the NNs delivered to the users plus the cost related to content updating. By using deep reinforcement learning (DRL), we propose a low-complexity suboptimal scheduling algorithm. Simulation results show that, under the same update frequency, our proposed algorithms outperform the periodic cache content update scheme and reduce the average AoI by up to 35%.
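The intuition behind the queue-aware scheme can be illustrated with a toy single-content simulation. The sketch below is not the paper's MDP/DRL algorithm; it only shows why, under an equal update budget, placing updates just before known queued requests can yield a lower delivered AoI than periodic refreshing. All names (`simulate`, the one-slot-before heuristic, the horizon and budget values) are illustrative assumptions.

```python
import random

def simulate(horizon, update_slots, requests):
    """Average AoI (slots since the last cache update) of the
    content at the slots where a request is served."""
    aoi = 0
    total = served = 0
    for t in range(horizon):
        if t in update_slots:
            aoi = 0      # cache refreshed at the start of this slot
        else:
            aoi += 1     # cached copy ages by one slot
        if t in requests:
            total += aoi
            served += 1
    return total / served

random.seed(0)
horizon = 100
# Known future requests, as if read from the user request queue.
requests = set(random.sample(range(1, horizon), 10))

# Baseline: periodic updates, one every 10 slots (budget: 10 updates).
periodic = {t for t in range(horizon) if t % 10 == 0}

# Queue-aware heuristic (hypothetical): same budget of 10 updates,
# each placed one slot before a known request, so every delivered
# copy is at most one slot old.
queue_aware = {t - 1 for t in requests}

print("periodic avg AoI:   ", simulate(horizon, periodic, requests))
print("queue-aware avg AoI:", simulate(horizon, queue_aware, requests))
```

With the same number of updates, the queue-aware placement caps the delivered AoI at one slot, while the periodic scheme delivers copies that are, on average, several slots old; the paper's DRL scheduler learns such placements while also trading off update cost.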
