战略排队系统中的稳定性和学习

论文标题

战略排队系统中的稳定性和学习

Stability and Learning in Strategic Queuing Systems

论文作者

Gaitonde, Jason, Tardos, Eva

论文摘要

限制无政府状态的价格，这是一个重要的研究领域。在本文中，我们在游戏建模排队系统的背景下研究了这种现象：路由器竞争服务器，其中未获得服务的数据包将在以后的回合中感到不满，从而导致每轮数据包数量取决于前一轮路由器的成功。我们将其建模为（无限）重复的游戏，在该游戏中，系统保留了从上一轮的结果产生的状态（每个队列持有的数据包）。我们假设路由器满足了无纤维条件，例如他们使用学习策略来确定数据包获得最佳服务的服务器。重复游戏的经典作品是一个有力的假设，即随后的重复游戏是独立的（超出了过去历史学习的影响）。该系统中保留的数据包引起的结转效应使我们的上下文学习导致了高度依赖的随机过程。我们分析了这一随机过程，并发现服务器的容量足够高，即使将所有数据包允许允许将所有数据包提供给数据包到达率，并且队列使用不需要的rebret学习算法，那么假设较旧的数据包将保持限制，那么排名前的预期数据包将保持界限。本文是第一个在排队系统中研究自私学习的效果的论文，在该系统中，学习者争夺资源，但回合并非全部独立：在每轮比赛中要路由的数据包数量取决于前一轮路由器的成功。

Bounding the price of anarchy, which quantifies the damage to social welfare due to selfish behavior of the participants, has been an important area of research. In this paper, we study this phenomenon in the context of a game modeling queuing systems: routers compete for servers, where packets that do not get service will be resent at future rounds, resulting in a system where the number of packets at each round depends on the success of the routers in the previous rounds. We model this as an (infinitely) repeated game, where the system holds a state (number of packets held by each queue) that arises from the results of the previous round. We assume that routers satisfy the no-regret condition, e.g. they use learning strategies to identify the server where their packets get the best service. Classical work on repeated games makes the strong assumption that the subsequent rounds of the repeated games are independent (beyond the influence on learning from past history). The carryover effect caused by packets remaining in this system makes learning in our context result in a highly dependent random process. We analyze this random process and find that if the capacity of the servers is high enough to allow a centralized and knowledgeable scheduler to get all packets served even with double the packet arrival rate, and queues use no-regret learning algorithms, then the expected number of packets in the queues will remain bounded throughout time, assuming older packets have priority. This paper is the first to study the effect of selfish learning in a queuing system, where the learners compete for resources, but rounds are not all independent: the number of packets to be routed at each round depends on the success of the routers in the previous rounds.

下载PDF全文

下载文献需遵守相关版权规定

论文标题