by David Gross
Chris Mellor at The Register has a good analysis of some of the challenges of developing End-to-End FCoE networks. The few FCoE implementations that have shipped to date bring Fibre Channel and Ethernet together at the CNA, and then split the Ethernet and Fibre Channel traffic at an Ethernet switch, allowing the LAN bytes to go one way, and the Fibre Channel bytes to go their own way back to the SAN. End-to-End wouldn't just mean one NIC/CNA as current FCoE does, but one switch, and one network. Reminds me a lot of the God Box concept we saw 10 years ago in telecom networks, and ATM's promise of LAN/WAN integration in the mid-90s.
There are a number of operational challenges relating to frame prioritization. The IEEE is addressing this in part through 802.1Qbb, Priority-based Flow Control. However, Fibre Channel transmissions are not like the 150 byte trade orders that often fill InfiniBand networks. As I mentioned in this morning's article, FC is the long freight train of data networking, allowing up to 65,536 frames per sequence. At 2,112 bytes per frame, this means everyone could get stuck behind a sequence as long as 138 Megabytes while it crosses the wire. Kind of like sitting at the railroad junction in your Toyota while you wait anxiously for the caboose to go by. Now it's one thing to do this on specific server-to-switch links, but across the entire network?
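The freight-train math above is easy to verify. A quick sketch, using only the maximums already cited (65,536 frames per sequence, 2,112 bytes per frame) and an assumed 10 Gb/s link speed for the serialization estimate:

```python
# Worst-case Fibre Channel sequence size, from the maximums cited above.
FRAMES_PER_SEQUENCE = 65_536   # 16-bit sequence count
BYTES_PER_FRAME = 2_112        # max FC frame payload

sequence_bytes = FRAMES_PER_SEQUENCE * BYTES_PER_FRAME
print(f"{sequence_bytes:,} bytes ≈ {sequence_bytes / 1e6:.0f} MB")
# → 138,412,032 bytes ≈ 138 MB

# Time to serialize that sequence on a 10 Gb/s link (ignoring
# inter-frame gaps and encoding overhead):
link_bps = 10e9
print(f"~{sequence_bytes * 8 / link_bps:.2f} s on the wire at 10G")
# ≈ 0.11 s — a long wait at the railroad junction
```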
Additionally, to get around the congestion created by Spanning Tree, a routable protocol will be needed to open up ports that would otherwise be disabled to prevent looping, a point Mellor mentions in the Register article. But what he doesn't mention, and what I don't get, is how the switch manufacturers will deal with the cost of the added memory needed to handle this. Whether you go with TRILL or FabricPath, if you're storing routes in a table, you'll need more memory in the switch, which adds significant hardware cost. While Clos architectures are mostly used in supercomputing, not enterprise data centers, they are designed to limit memory requirements, because they don't force each switch to build a table of every known route across the network. This makes the switches more cost effective (it's one reason why InfiniBand and ToR switches go for less than $500 per 10G port). Switch memory is a precious resource, and no one wants to see data center switches heading for the price levels of Layer 3 boxes that can hold 300,000 BGP4 routes.
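To make the memory argument concrete, here's a rough, hypothetical sizing sketch. The entry size and route counts are illustrative assumptions of mine, not figures from any vendor's TRILL or FabricPath implementation:

```python
# Hypothetical sizing of a route table held in switch memory.
# 64 bytes per entry and the route counts below are assumptions
# for illustration only, not vendor specifications.
def table_memory_bytes(num_routes: int, bytes_per_entry: int = 64) -> int:
    """Memory needed for num_routes entries of bytes_per_entry each."""
    return num_routes * bytes_per_entry

# A flat fabric where every switch must learn every route:
flat = table_memory_bytes(num_routes=100_000)
# A Clos-style fabric where each switch only holds local + uplink entries:
clos = table_memory_bytes(num_routes=5_000)
print(f"flat: {flat / 1e6:.1f} MB   clos: {clos / 1e6:.1f} MB")
# → flat: 6.4 MB   clos: 0.3 MB
```

The absolute numbers are invented, but the ratio is the point: a design that spares each switch from holding the full network's routes needs a fraction of the table memory, which is exactly why Clos fabrics keep per-port costs down.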
The device that handles these TRILL routes is called a Routing Bridge, or RBridge. Reminds me of the "switch routers" of the early 2000s that targeted telco networks and offered switching capabilities at high router prices. The RBridge is going to need some expensive ASICs to handle all the added features the IETF and IEEE are developing, in addition to more memory for its routing tables.
With the economics already looking very challenged, Mellor's piece in The Register ends by pointing out that End-to-End FCoE will also require merging the SAN managers with the LAN managers. I've lived through attempts to merge IP specialists with optical specialists, and they never went anywhere. While Fibre Channel and Ethernet experts have more in common with each other than packet and optical transport people ever did, most FC people are exceptionally knowledgeable about tape drives, RAID arrays, and other storage technologies in addition to FC itself. I wouldn't want to be at the meeting where they're asked to also become general networking experts, while giving up control of the SANs they know so well.
All in all, I'm extremely skeptical of end-to-end FCoE. It sounds good in theory, and it looks good in PowerPoint, but with higher ASIC costs, added memory costs, not to mention attempts to tie together different operating groups, it will likely make keeping LAN and SAN diverged look even better in cost comparisons.