Multi-Agent Neural Rewriter for Vehicle Routing with Limited Disclosure of Costs

Paper title

Multi-Agent Neural Rewriter for Vehicle Routing with Limited Disclosure of Costs

Authors

Nathalie Paul, Tim Wirtz, Stefan Wrobel, Alexander Kister

Abstract

We interpret solving the multi-vehicle routing problem as a team Markov game with partially observable costs. For a given set of customers to serve, the playing agents (vehicles) share the common goal of determining the team-optimal agent routes with minimal total cost. Each agent observes only its own cost. Our multi-agent reinforcement learning approach, the multi-agent Neural Rewriter, builds on the single-agent Neural Rewriter and solves the problem by iteratively rewriting solutions. Parallel execution of agent actions and partial observability require new rewriting rules for the game. We propose introducing a so-called pool into the system, which serves as a collection point for unvisited nodes. It enables agents to act simultaneously and to exchange nodes in a conflict-free manner. We realize limited disclosure of agent-specific costs by sharing them only during learning. During inference, each agent acts in a decentralized manner, based solely on its own cost. First empirical results on small problem sizes show that we reach a performance close to that of the employed OR-Tools benchmark, which operates in the perfect cost information setting.
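To make the pool idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simple action set (`release`, `claim`, `noop`), Euclidean route costs, and a tie-breaking rule in which the lowest-indexed agent wins when two agents claim the same pooled node. All function names, the action encoding, and the tie-breaking rule are hypothetical; the abstract does not specify them.

```python
import math


def route_cost(depot, route, coords):
    """Cost an agent observes for its own route: depot -> stops -> depot."""
    stops = [depot] + route + [depot]
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(stops, stops[1:]))


def parallel_step(routes, pool, actions):
    """Apply one round of simultaneous agent actions (hypothetical encoding).

    actions[i] is one of:
      ("release", node)          agent i moves a node from its route to the pool
      ("claim", node, position)  agent i inserts a pooled node into its route
      ("noop",)                  agent i does nothing
    """
    # Claims are resolved first, against the pool as it stood at the start of
    # the round. Assumption: on a clash the lowest-indexed agent wins.
    for i, act in enumerate(actions):
        if act[0] == "claim" and act[1] in pool:
            pool.remove(act[1])
            routes[i].insert(act[2], act[1])
    # Releases only touch the releasing agent's own route, so they cannot
    # conflict with each other.
    for i, act in enumerate(actions):
        if act[0] == "release" and act[1] in routes[i]:
            routes[i].remove(act[1])
            pool.add(act[1])
    return routes, pool


# Usage: two agents, depot 0, three customers.
coords = {0: (0, 0), 1: (3, 4), 2: (6, 8), 3: (0, 5)}
routes = [[1, 2], [3]]
pool = set()

parallel_step(routes, pool, [("release", 2), ("noop",)])       # node 2 goes to the pool
parallel_step(routes, pool, [("claim", 2, 0), ("claim", 2, 1)])  # both claim; agent 0 wins
own_costs = [route_cost(0, r, coords) for r in routes]          # each agent sees only its entry
```

Because nodes change hands only through the pool, no two agents ever modify the same route in the same round, which is one simple way to obtain the conflict-free exchange the abstract describes.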
