Joint Switch-Controller Association and Control Devolution for SDN Systems: An Integration of Online Control and Online Learning

Paper Title

Joint Switch-Controller Association and Control Devolution for SDN Systems: An Integration of Online Control and Online Learning

Authors

Huang, Xi; Tang, Yinxu; Shao, Ziyu; Yang, Yang; Xu, Hong

Abstract

In software-defined networking (SDN) systems, it is a common practice to adopt a multi-controller design and control devolution techniques to improve the performance of the control plane. However, in such systems, the decision-making for joint switch-controller association and control devolution often involves various uncertainties, e.g., the temporal variations of controller accessibility, and computation and communication costs of switches. In practice, statistics of such uncertainties are unattainable and need to be learned in an online fashion, calling for an integrated design of learning and control. In this paper, we formulate a stochastic network optimization problem that aims to minimize time-average system costs and ensure queue stability. By transforming the problem into a combinatorial multi-armed bandit problem with long-term stability constraints, we adopt bandit learning methods and optimal control techniques to handle the exploration-exploitation tradeoff and long-term stability constraints, respectively. Through an integrated design of online learning and online control, we propose an effective Learning-Aided Switch-Controller Association and Control Devolution (LASAC) scheme. Our theoretical analysis and simulation results show that LASAC achieves a tunable tradeoff between queue stability and system cost reduction with a sublinear time-averaged regret bound over a finite time horizon.
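The abstract combines two ideas: learning unknown costs through bandit feedback, and keeping queues stable through online control. The following is a minimal sketch of that combination for a single switch, not the LASAC scheme itself. It assumes a drift-plus-penalty style score with a lower-confidence-bound cost estimate; the function names, the confidence radius, the Gaussian cost noise, and all parameters (`true_cost`, `service`, `arrival_rate`, `V`) are hypothetical and chosen only for illustration.

```python
import math
import random


def lcb(mean, count, t):
    """Optimistic (lower-confidence) estimate of an unknown mean cost in [0, 1]."""
    if count == 0:
        return 0.0  # an untried option looks as cheap as possible
    return max(0.0, mean - math.sqrt(1.5 * math.log(t + 1) / count))


def simulate(true_cost, service, arrival_rate, V, horizon, seed=0):
    """Toy single-switch loop; all inputs are illustrative assumptions.

    true_cost[k]: mean cost of option k (hidden from the learner)
    service[k]:   requests option k removes from the queue per slot
    V:            weight on cost versus queue backlog
    """
    rng = random.Random(seed)
    n_options = len(true_cost)
    mean = [0.0] * n_options
    count = [0] * n_options
    queue = 0.0
    total_cost = 0.0
    for t in range(1, horizon + 1):
        # Score = V * (optimistic cost) - backlog * service; pick the smallest.
        scores = [V * lcb(mean[k], count[k], t) - queue * service[k]
                  for k in range(n_options)]
        k = min(range(n_options), key=scores.__getitem__)
        # Observe a noisy cost for the chosen option only (bandit feedback).
        cost = min(1.0, max(0.0, rng.gauss(true_cost[k], 0.1)))
        count[k] += 1
        mean[k] += (cost - mean[k]) / count[k]
        total_cost += cost
        # Queue update: serve first, then admit at most one new request.
        arrivals = 1 if rng.random() < arrival_rate else 0
        queue = max(queue - service[k], 0.0) + arrivals
    return total_cost / horizon, queue, count


avg_cost, backlog, pulls = simulate([0.2, 0.8], [1, 2], 0.5, V=10.0, horizon=500)
print(avg_cost, backlog, pulls)
```

In this toy setting, a larger `V` puts more weight on cost and tolerates a longer queue, while a smaller `V` drains the queue more aggressively. This mirrors, in simplified form, the tunable tradeoff between queue stability and cost reduction that the abstract describes.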
