Peeling back the onion on HP-FEX
http://virtualeverything.wordpress.com/2011/10/24/peeling-back-the-onion-on-hp-fex/
Recently, HP and Cisco collaborated to release a FEX module for the HP C7000 chassis. See here and here to read about the release from HP's and Cisco's perspectives. This post is not about the business decisions behind the product release, but rather takes a closer look at the HP-FEX architecture from a technology perspective.
First of all, what the heck is a FEX? Read here and here for some background on the term.
Now, with that out of the way, let's take a look at the networking architecture when deploying HP blade servers.
HP’s leading interconnect architecture is known as Virtual Connect FlexFabric. There are two main components to this:
- server profile virtualization: Virtual Connect server profiles allow one to take attributes of a server such as WWNs, MAC addresses, FC boot parameters, etc. and store them as a software construct, thus making the hardware itself "stateless". The Cisco UCS analog to this would be Service Profiles. For a deep dive into the differences, see here.
- virtualizing the 10Gb adapter port: allowing one to present up to 4x NICs to the host OS with traditional Flex-10, or 3x NICs and 1x FCoE HBA with FlexFabric interconnects (a small sketch of the carving idea follows this list). Cisco's analog to this would be their "VIC" card, which allows one to create up to 256 vNICs and vHBAs and present them to the host. There are some technical differences between Flex-10 and Palo, but that is not the focus of this post either; plenty of information on that subject is readily available via Google.
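To make the port-carving idea concrete, here is a minimal sketch in plain Python. It is purely illustrative: the device labels and speed values are assumptions made up for the example, not HP's actual configuration interface. The one real constraint it encodes is that a port presents at most 4 devices whose allocated bandwidths must fit within the physical 10Gb link.

```python
# Illustrative model (not an HP API): carving a 10Gb CNA port into up
# to four Flex-10/FlexFabric devices whose allocated bandwidths must
# fit within the physical 10Gb link.

PORT_SPEED_GB = 10
MAX_DEVICES = 4

def partition_port(allocations):
    """allocations: list of (device_type, speed_gb) tuples,
    e.g. [("NIC", 4), ("NIC", 2), ("NIC", 2), ("FCoE-HBA", 2)]."""
    if len(allocations) > MAX_DEVICES:
        raise ValueError("Flex-10/FlexFabric presents at most 4 devices per port")
    total = sum(speed for _, speed in allocations)
    if total > PORT_SPEED_GB:
        raise ValueError(f"allocations ({total}Gb) exceed the {PORT_SPEED_GB}Gb port")
    return allocations

# 3 NICs + 1 FCoE HBA: the FlexFabric case used later in this post.
print(partition_port([("NIC", 4), ("NIC", 2), ("NIC", 2), ("FCoE-HBA", 2)]))
```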
First, let's take a look at what an HP BladeSystem deployment utilizing the Virtual Connect FlexFabric architecture could look like:
The components here are 1x C7000 chassis with 16 blades utilizing FlexFabric interconnects and integrated FlexFabric LOMs, which give 2x 10Gb CNAs per blade. The bottommost diagram represents a logical view from the OS perspective of a single blade. FlexFabric allows the administrator to divide a single 10Gbps CNA port into 4 devices: 3 NICs and 1 HBA, or 4 NICs. In this case, we have chosen 3 NICs and 1 HBA to illustrate the FC/FCoE case. The operating system sees a total of 8 devices, 4 per CNA port, and communicates with the CNA as if they were traditional NICs and HBAs.

The FlexFabric LOM then combines the NIC and HBA traffic into an FCoE stream and sends it through the midplane of the chassis up to the FlexFabric interconnects. The FlexFabric interconnects split the FCoE traffic back into traditional Ethernet and Fibre Channel via separate ports and send them upstream out of the chassis. In this case, a pair of Nexus 5Ks is used, which can house both LAN and SAN ports. This Nexus switch could also uplink into a "core" LAN/SAN; many architectures are possible upstream. Note that while the LAN connections are cross-connected between switches, the SAN connections are *NOT*. This is because traditional Fibre Channel design relies on this "air-gapped" connectivity to maintain 2 separate fabrics.
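As a quick illustration of what the OS ends up seeing, here is a tiny sketch. The device names (eth0..eth5, fc0/fc1) are hypothetical placeholders, not actual enumeration output from any particular OS:

```python
# Purely illustrative: the 8 logical devices a dual-port FlexFabric
# LOM presents when each 10Gb port is carved into 3 NICs + 1 HBA.
# Device names are hypothetical placeholders.

def flexfabric_devices(ports=2, nics_per_port=3):
    devices = []
    for port in range(ports):
        for i in range(nics_per_port):
            devices.append((f"eth{port * nics_per_port + i}", f"FlexNIC on CNA port {port}"))
        devices.append((f"fc{port}", f"FlexHBA on CNA port {port}"))
    return devices

for name, role in flexfabric_devices():
    print(f"{name:5s} {role}")
print(f"total devices: {len(flexfabric_devices())}")  # 8, matching the diagram
```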
Let's contrast this with an HP BladeSystem deployment utilizing the B22HP-FEX:
This block diagram is very similar. The bottommost figure represents a logical view of a blade from an OS perspective. Unlike the FlexFabric configuration, when utilizing HP-FEX the administrator does NOT have the option of creating 4 individual devices per CNA port; it defaults to a "regular" CNA presenting 1 NIC and 1 HBA per port. The administrator will have to use other means of providing QoS, since all the LAN traffic will travel through a single interface on the OS side. The classic example is creating multiple interfaces for VMware deployments: service console/VMotion, production VMs, backup, etc. Another notable difference is that the traffic leaving the chassis is FCoE, whereas in the FlexFabric design it was broken out into its LAN/SAN counterparts. In this example I used the same number of ports for the upstream connectivity. The B22HP-FEX talks FCoE to the upstream 5Ks, which can then connect into "core" LAN/SAN infrastructures in larger deployments.
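To illustrate the QoS point, here is a minimal sketch of share-based bandwidth division across traffic classes sharing a single 10Gb NIC, in the spirit of hypervisor traffic shaping. The traffic classes and weights are assumptions made up for this example, not settings from any real product:

```python
# Hypothetical example: with one 10Gb NIC per port (the FEX case),
# LAN QoS has to happen in software. Share-based weights divide the
# link among traffic classes under contention; classes and weights
# here are assumptions for illustration only.

LINK_GB = 10
shares = {"mgmt/vMotion": 20, "production VMs": 60, "backup": 20}

total = sum(shares.values())
for traffic_class, weight in shares.items():
    gb = LINK_GB * weight / total
    print(f"{traffic_class:15s} -> {gb:.1f} Gb/s under contention")
```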
Notable differences between the architectures:
- in the FlexFabric deployment, you have the option of creating up to 4 interfaces per CNA port. In the FEX design, you do not have this capability.
- the server profile features offered by Virtual Connect are available in the FlexFabric deployment, but not in the B22HP-FEX deployment. This is a big deal, since one of the major selling points of an HP BladeSystem is the ability to utilize Virtual Connect to abstract away the server hardware.
- in the FlexFabric deployment, you have to decide up front how many Ethernet and Fibre Channel connections you want upstream of the chassis. In the FEX design, since the traffic leaving the chassis is FCoE, you do not have to make physical wiring changes in order to allocate LAN/SAN bandwidth; it can be done in software in the upstream Nexus 5Ks.
- both the FlexFabric interconnects and the B22HP-FEX offer 2:1 oversubscription, meaning there are 16 ports downstream (1 per blade) and 8 ports upstream (0.5 per blade). However, the ability to utilize vPC on all the FEX links allows MUCH better utilization of the links. And because some (2) of the FlexFabric connections will be chewed up by chassis interconnects to create a single Virtual Connect domain, you actually have a higher (worse) oversubscription ratio in the FlexFabric case (the arithmetic is sketched just after this list).
- from a points-of-management perspective, the B22HP-FEX interconnects are not managed individually; they act as remote line cards of the 5K (just like the standard Cisco Nexus 2000 Series FEX). Each FlexFabric interconnect (pair), on the other hand, is a point of management.
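A quick back-of-the-envelope sketch of that oversubscription arithmetic, using only the port counts quoted above:

```python
# Oversubscription arithmetic from the list above: 16 downlinks
# (one per blade) over 8 uplinks is 2:1 for both interconnects,
# but if 2 FlexFabric uplink ports are consumed stacking the
# Virtual Connect domain, only 6 uplinks remain.

def oversubscription(downlinks, uplinks):
    return downlinks / uplinks

print(f"B22HP-FEX:  {oversubscription(16, 8):.2f}:1")      # 2.00:1
print(f"FlexFabric: {oversubscription(16, 8 - 2):.2f}:1")  # 2.67:1
```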
The lack of blade profile virtualization is a MAJOR downside to utilizing the FEX in an HP BladeSystem. I don't think anyone will dispute that the FEX-based network architecture is cleaner and simpler, ESPECIALLY at scale; but customers will have to choose between a superior network architecture and the benefits that come along with blade profile virtualization... unless they decide to go with Cisco UCS, in which case they can have both. That being said, there are clear advantages and disadvantages to either design, so it's going to be up to the customer to decide what is more important to them.