Data traffic congestion management for data center ethernet.
Date of Issue: 2013
School of Computer Engineering
Ten-gigabit Ethernet was first standardized in 2002. Over the years, successful standardization has led to wide acceptance and commercialization in the industry. Ethernet offers an attractive way for data centers to consolidate heterogeneous traffic into a single fabric. A data center usually consists of three overlapping networks: a Storage Area Network (SAN), a Local Area Network (LAN) and a High Performance Computing (HPC) network. To satisfy its specialized requirements, each of them transmits traffic through a dedicated fabric. By replacing dedicated switches, transceivers and adapters with Ethernet components, consolidation reduces both capital expenditure (CAPEX) and operational expenditure (OPEX). Although Ethernet has many advantages, it remains inadequate because it lacks congestion management support. In this thesis, an Ethernet congestion management solution is proposed to satisfy the requirements of loss-free delivery, high throughput and, preferably, minimal energy usage for data center networks. Conventional Ethernet offers only best-effort delivery, which tolerates frame drops, whereas SAN/HPC traffic is sensitive to loss. The consolidation solution should control congestion so that higher-priority SAN/HPC traffic is free from loss. Moreover, Ethernet uses the Spanning Tree Protocol (STP) to prevent loops; STP prunes a mesh topology to a single tree, eliminating redundant paths and reducing network throughput. To preserve high bisection bandwidth, the solution should enhance Ethernet with a multipath feature to ease congestion. In addition, although multipath increases network throughput, it incurs extra power consumption by distributing traffic across all paths even when traffic volume is low. The tradeoff between load balancing and traffic aggregation thus becomes another research topic: adapting to congestion while economizing data center cost.
Thus a comprehensive Ethernet congestion management solution should satisfy three requirements, namely prioritized congestion control, high-throughput congestion easing and energy-efficient congestion adaptation. However, no systematic study of such an approach has been found. This research is conducted to fill the gap left by existing approaches and to explore the possibility of enhanced Ethernet congestion management for data centers. Firstly, a prioritized end-to-end congestion control scheme is proposed to combat congestion and protect traffic with higher throughput requirements, by introducing Active Queue Management (AQM) and Additive Increase and Multiplicative Decrease (AIMD) to detect congestion and constrain source rates in response to it. Systematic analytical work is also presented to study the suitability of the proposal and to recommend system parameters. Secondly, an in-depth study is conducted to investigate multipath routing, where a central controller monitors network status and controls the generation and distribution of traffic. This two-tiered scheme not only eases congestion effectively through fine-grained load balancing but also throttles excessive traffic before it enters the network core. Finally, because load balancing among multiple paths is at odds with efficient energy usage, the relationship between power consumption and the number of active links is studied and found to be linear. A proposal for energy optimization that dynamically balances load and adapts link rate to congestion is then presented to achieve minimal power consumption without compromising throughput. All three studies are validated by extensive simulations using the OMNeT++ simulator, which demonstrate the favorable performance of the proposed schemes.
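As a rough illustration of the AIMD idea mentioned above, the following Python sketch shows two sources sharing one bottleneck link. It is not the scheme developed in the thesis: the class `AimdSource`, the function `simulate`, the parameter values, and the congestion test (aggregate rate exceeding link capacity, used here as a crude stand-in for an AQM queue threshold) are all assumptions made for illustration. Giving one source a larger additive step is one simple way to let it claim a larger share of the link.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AimdSource:
    """Hypothetical rate-limited source; all units are Mbit/s."""
    rate: float                    # current sending rate
    increase_step: float           # additive increase per update (assumed value)
    decrease_factor: float = 0.5   # multiplicative decrease on congestion
    max_rate: float = 10_000.0     # 10 Gbit/s line rate
    min_rate: float = 1.0          # floor so a source is never starved to zero

    def update(self, congested: bool) -> float:
        """Apply one AIMD step based on a congestion signal."""
        if congested:
            # Multiplicative decrease: back off quickly.
            self.rate = max(self.min_rate, self.rate * self.decrease_factor)
        else:
            # Additive increase: probe gently for spare capacity.
            self.rate = min(self.max_rate, self.rate + self.increase_step)
        return self.rate


def simulate(sources: List[AimdSource], capacity: float, steps: int) -> None:
    """Drive all sources with a shared congestion signal.

    Congestion is declared when the aggregate rate exceeds the link
    capacity (an assumed stand-in for an AQM queue-length threshold).
    """
    for _ in range(steps):
        congested = sum(s.rate for s in sources) > capacity
        for s in sources:
            s.update(congested)


if __name__ == "__main__":
    san = AimdSource(rate=100.0, increase_step=30.0)  # higher-priority class
    lan = AimdSource(rate=100.0, increase_step=10.0)  # lower-priority class
    simulate([san, lan], capacity=1000.0, steps=200)
    print(f"SAN rate: {san.rate:.1f}, LAN rate: {lan.rate:.1f}")
```

In this toy model both sources back off at the same moments, so the source with the larger additive step settles at a higher rate. The differentiation mechanism and the parameters actually used in the thesis may differ.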
In summary, we propose prioritized congestion control that achieves ratio differentiation and combats congestion, loss-free multipath routing that distributes load almost equally to ease congestion and maximize network throughput, and efficient energy optimization that adapts load distribution and aggregation to the congestion state. The proposed congestion management solution achieves the objectives of the study, with a 3.5 SAN/LAN ratio, two-fold traffic volume support and up to 60% power savings.