Paper Title
Carbide: Highly Reliable Networks Through Real-Time Multiple Control Plane Composition
Paper Authors
Paper Abstract
Achieving highly reliable networks is essential for network operators to ensure proper packet delivery in the event of software errors or hardware failures. Networks must ensure reachability and routing correctness, such as subnet isolation and waypoint traversal. Existing work in network verification relies on centralized computation at the cost of fault tolerance, while other approaches either build an over-engineered, complex control plane, or compose multiple control planes without providing any guarantee on correctness. This paper presents Carbide, a novel system to achieve high reliability in networks through distributed verification and multiple control plane composition. The core of Carbide is a simple, generic, efficient distributed verification framework that transforms a generic network verification problem to a reachability verification problem on a directed acyclic graph (DAG), and solves the latter via an efficient distributed verification protocol (DV-protocol). Equipped with verification results, Carbide allows the systematic composition of multiple control planes and realization of operator-specified consistency. Carbide is fully implemented. Extensive experiments show that (1) Carbide reduces downtime by 43% over the most reliable individual underlying control plane, while enforcing correctness requirements on all traffic; and (2) by systematically decomposing computation to devices and pruning unnecessary messaging between devices during verification, Carbide scales to a production data center network.
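The abstract does not spell out the DV-protocol, so the following is only a minimal sketch of the general idea it names: reachability verification on a DAG in which each node decides its own status from what its downstream neighbors report. The function name `verify_reachability`, the edge-list input, and the single-process queue standing in for messages between devices are all assumptions made for illustration, not Carbide's actual design.

```python
from collections import defaultdict, deque


def verify_reachability(edges, source, destination):
    """Return True if `destination` is reachable from `source` in a DAG.

    Hypothetical illustration: each node waits until all of its successors
    have reported, then reports its own result to its predecessors. The
    queue simulates those reports; this is not the DV-protocol itself.
    """
    successors = defaultdict(list)
    predecessors = defaultdict(list)
    nodes = {source, destination}
    for u, v in edges:
        successors[u].append(v)
        predecessors[v].append(u)
        nodes.update((u, v))

    # Number of downstream reports each node is still waiting for.
    pending = {n: len(successors[n]) for n in nodes}
    # A node trivially reaches the destination if it is the destination.
    reaches = {n: (n == destination) for n in nodes}

    # Nodes with no successors can report immediately.
    ready = deque(n for n in nodes if pending[n] == 0)
    while ready:
        node = ready.popleft()
        for pred in predecessors[node]:
            # Combine the downstream report into the predecessor's local result.
            reaches[pred] = reaches[pred] or reaches[node]
            pending[pred] -= 1
            if pending[pred] == 0:
                ready.append(pred)
    return reaches[source]


if __name__ == "__main__":
    example_edges = [("A", "B"), ("B", "D"), ("A", "C")]
    print(verify_reachability(example_edges, "A", "D"))
    print(verify_reachability(example_edges, "C", "D"))
```

In this sketch every node performs only local work and sends one report per incoming edge, which is one simple way to picture decomposing the computation across devices.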