通过多发出深入强化学习解决动态图形问题

论文标题

通过多发出深入强化学习解决动态图形问题

Solving Dynamic Graph Problems with Multi-Attention Deep Reinforcement Learning

论文作者

Gunarathna, Udesh, Borovica-Gajic, Renata, Karunasekara, Shanika, Tanin, Egemen

论文摘要

诸如旅行推销员问题或发现最小的斯坦纳树之类的图形问题被广泛研究并用于数据工程和计算机科学。通常，在现实世界中，图表的特征往往会随着时间的流逝而变化，因此，找到解决问题的解决方案变得具有挑战性。许多图形问题的动态版本是运输，电信和社交网络中众多现实世界问题的关键。近年来，使用深度学习技术来找到NP-HARD图组合问题的启发式解决方案引起了极大的兴趣，因为这些学到的启发式方法可以有效地找到近乎最佳的解决方案。但是，大多数学习启发式方法的现有方法都集中在静态图问题上。动态性质使NP-HARD图问题的学习更具挑战性，并且现有方法无法找到合理的解决方案。在本文中，我们提出了一种新颖的结构，并通过增强学习（GTA-RL）来学习基于图的动态组合优化问题的启发式解决方案。 GTA-RL架构由一个编码器组成，能够嵌入组合问题实例的时间特征和能够动态关注嵌入式功能的解码器，以找到针对给定组合问题实例的解决方案。然后，我们将架构扩展以了解组合优化问题的实时版本的启发式方法，在这些问题中，问题的所有输入特征尚不清楚，而是实时学习。我们针对几种基于学习的算法和最佳求解器的实验结果表明，我们的方法在有效性和最佳求解器方面优于最先进的学习方法，而在动态和实时图形组合组合优化方面，我们的方法优于最佳求解器。

Graph problems such as traveling salesman problem, or finding minimal Steiner trees are widely studied and used in data engineering and computer science. Typically, in real-world applications, the features of the graph tend to change over time, thus, finding a solution to the problem becomes challenging. The dynamic version of many graph problems are the key for a plethora of real-world problems in transportation, telecommunication, and social networks. In recent years, using deep learning techniques to find heuristic solutions for NP-hard graph combinatorial problems has gained much interest as these learned heuristics can find near-optimal solutions efficiently. However, most of the existing methods for learning heuristics focus on static graph problems. The dynamic nature makes NP-hard graph problems much more challenging to learn, and the existing methods fail to find reasonable solutions. In this paper, we propose a novel architecture named Graph Temporal Attention with Reinforcement Learning (GTA-RL) to learn heuristic solutions for graph-based dynamic combinatorial optimization problems. The GTA-RL architecture consists of an encoder capable of embedding temporal features of a combinatorial problem instance and a decoder capable of dynamically focusing on the embedded features to find a solution to a given combinatorial problem instance. We then extend our architecture to learn heuristics for the real-time version of combinatorial optimization problems where all input features of a problem are not known a prior, but rather learned in real-time. Our experimental results against several state-of-the-art learning-based algorithms and optimal solvers demonstrate that our approach outperforms the state-of-the-art learning-based approaches in terms of effectiveness and optimal solvers in terms of efficiency on dynamic and real-time graph combinatorial optimization.

下载PDF全文

下载文献需遵守相关版权规定

论文标题