A Survey on Reinforcement Learning for Combinatorial Optimization

Title

A Survey on Reinforcement Learning for Combinatorial Optimization

Authors

Yang, Yunhao; Whinston, Andrew

Abstract

This paper reviews reinforcement learning (RL) for combinatorial optimization in detail. It traces the history of combinatorial optimization from the 1950s onward and compares that earlier work with the RL algorithms of recent years. The paper focuses on one well-known combinatorial problem, the traveling salesperson problem (TSP), and compares modern RL algorithms for the TSP with an approach published in the 1970s. By examining the similarities and differences between these methodologies, the paper shows how RL algorithms have improved as machine learning techniques and computing power have evolved. The paper then briefly introduces deep RL, a deep learning approach to the TSP that extends the traditional mathematical framework. In deep RL, attention and feature-encoding mechanisms are introduced to generate near-optimal solutions. The survey shows that combining deep learning mechanisms such as attention with RL can produce effective approximate solutions to the TSP. The paper also argues that deep learning could serve as a generic approach that can be integrated with any traditional RL algorithm to improve results on the TSP.
