In June, I was in Boston for Virtualization Field Day 5, which was an amazing event. The sponsor presentations are usually awesome, but the next best thing about Tech Field Day events is the conversations you have with other delegates between presentations. On one such trip, Stephen Foskett wondered why none of the hyperconverged vendors has converged networking: all of them rely on physical Ethernet switches. I spent the next half hour talking with Chris Marget about what the requirements might be and what networking technology might be used.

As we know, hyperconvergence places storage and compute into each node, then builds both storage and compute clusters by joining nodes. The VMs that these solutions are designed to deliver consume four food groups of resources: CPU, RAM, disk, and network. Current hyperconverged solutions converge the first three and rely on external devices to deliver the fourth. While it’s not mandatory, the storage network is usually a pair of 10 GbE switches.

The questions that Chris and I explored were: “How would you converge the network into the hyperconverged nodes? In particular, could you build a network between the nodes that would carry all of the storage traffic between them?” This is the largest network load in a hyperconverged cluster: replicating written data and fetching non-local reads. Using an existing network for VM and management traffic would be fine, but the cluster traffic should be kept separate. I would consider the cluster network converged if the cables simply ran from node to node, with no external switches or concentrators. The most obvious way to do this is a mesh network, where each node connects directly to every other node. The problem is that a full mesh needs a cable from each node to every other node. That’s not a big problem with four nodes (six cables), but it is a real issue with a 64-node cluster (2,016 cables).
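To put numbers on that scaling problem, here is a minimal Python sketch (mine, not from the post) of the full-mesh cable count, n(n-1)/2, which reproduces the six and 2,016 cables mentioned above.

```python
# Illustrative only: cables needed for a full mesh of n hyperconverged nodes.
# Every pair of nodes gets its own cable, so the count is n * (n - 1) / 2.

def mesh_cables(nodes: int) -> int:
    """Number of point-to-point cables to connect every node pair directly."""
    return nodes * (nodes - 1) // 2

for n in (4, 16, 64):
    print(f"{n} nodes -> {mesh_cables(n)} cables")
# 4 nodes -> 6 cables
# 16 nodes -> 120 cables
# 64 nodes -> 2016 cables
```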

The other alternative is a loop cabling architecture. Each node connects to the next, and the last node connects back to the first to form a loop. I would go with a dual loop design for redundancy and so that expanding one loop doesn’t cause the whole network to stop. I’m also thinking more of FDDI loops than of token ring loops—node to node, without central switches.
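For contrast, a rough sketch (again mine, purely illustrative) of how a dual loop cables up: each ring needs one cable per node, so the total is 2n and grows linearly with cluster size. Beyond a handful of nodes that is far fewer cables than a mesh, which is the point of the loop design.

```python
# Illustrative only: cabling for a dual loop (two independent rings).
# Each ring has one cable per node (node i to node i+1, wrapping around),
# so the total is 2 * n and grows linearly rather than quadratically.

def dual_loop_cables(nodes: int) -> int:
    """Total cables for two independent rings of the same nodes."""
    return 2 * nodes

def ring_neighbours(node: int, nodes: int) -> tuple[int, int]:
    """Previous and next node on a single ring, wrapping at the ends."""
    return (node - 1) % nodes, (node + 1) % nodes

for n in (4, 16, 64):
    mesh = n * (n - 1) // 2
    print(f"{n} nodes: {dual_loop_cables(n)} loop cables vs {mesh} mesh cables")
```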

This is where it got very interesting: Chris told me about Plexxi. Plexxi has interesting networking technology that uses passive optical interconnects, allowing every node to be one hop from every other node without a sea of cables. Given that the interconnects are passive, I would guess that node-to-node cabling is probably also possible. What if this technology were integrated into the hyperconverged nodes in place of 10 GbE? Your hyperconverged cluster wouldn’t need 10 GbE switches, just optical cables looping from node to node. It also looks like you could attach a router to the Plexxi network for VM and management traffic, removing the need for 1 GbE to each node. Even more interesting is Plexxi’s technology to deliver the same layer 2 networks over 100 km of fibre to a remote data centre. This would be an immediate enabler for a metro cluster environment: a hyperconverged platform with an inherent ability to provide active/active clustering across data centres, with live VM migration and HA between them.

Does it make sense to further converge the network within a hyperconverged solution? I suspect not. While 10 GbE switch ports are not cheap, they aren’t a huge part of the cost of your hyperconverged infrastructure. The network is already converged compared to using Fibre Channel for storage and Ethernet for VMs and management. For customers who want to minimize cabling, a hyperconverged node can work with three network cables (two 10 GbE plus one 100 Mb Ethernet for out-of-band management) and two power cables. Reducing that to two optical cables plus power isn’t a huge saving. Interesting discussions are educational, but not every one leads to a breakthrough.

3 replies on “Hyperconverge Your Network Too?”

  1. Hi Alastair,

    A very good read! Being situated in the same building as Nutanix in San Jose, I sometimes become an unintentional listener to elevator conversations on this very same topic 🙂

    Hyperconverged infrastructure aims to eliminate traditionally siloed pieces of infrastructure, such as compute and storage, creating a single converged, horizontally scalable platform. As you rightly pointed out, the main hyperconverged solutions of today have not yet significantly tackled the network, but one thing is clear: the network must not remain another silo in the world of hyperconverged infrastructure.

    IMO, this is exactly where the principles of network virtualization can come to the rescue. Rather than creating point solutions for interconnecting the elements of the hyperconverged infrastructure, what if we were to “carve” virtual instances out of the existing network infrastructure for this purpose? Each virtual network catering to the needs of the hyperconverged infrastructure could be given performance, throughput, and scale characteristics equivalent to purpose-built physical networks, while remaining locally segmented and keeping operational costs lower with greater solution simplicity.

    We can go further than that. As elements of hyperconverged infrastructure become geographically dispersed, organizational Wide Area Networks need to be virtualized as well. This is where we look at technologies such as Software-Defined WAN to deliver end-to-end secure segmentation, ubiquitous encryption, and the flexibility to deliver network capacity in the most economical way by leveraging a hybrid transport approach.

    I happen to work at one of the vendors in the SD-WAN space, Viptela, which has taken it upon itself to provide a complete Wide Area Network virtualization solution that can easily be applied to the networking needs of centralized or geographically distributed hyperconverged infrastructure.

    Thank you for reading.
    David
    @DavidKlebanov

  2. Hi Alastair,

    Great article. You’ve definitely hit on something that will become more and more apparent to more and more people, and it’s definitely what’s driving our growth and the interest in our technology.

    The uber-simplicity of the HC model starts to get a lot less simple when you grow beyond the first few racks, and ideally it shouldn’t. In Plexxi’s case, we propose a model by which, when you add a new rack, you click it in with one or two cables, and the network automatically configures itself _and_ can be told to tell the cloud orchestration layer about the new resources that have just been added. This model extends from rack to rack and data center to data center in exactly the same way.

    Beyond the optical technology, and probably more importantly, is the software layer that is designed to understand workloads or classes of traffic, and provide specific network paths to those workloads/classes based on a business-driven policy. This changes networking from being about arbitrary L1/L2/L3 boundaries to being about delivering workloads that meet the expectations of the business. In this way, workloads can be assigned a specific set of paths that meet their performance and/or security needs similar to how VMs are assigned to cores.

    It’s a great topic and one that we increasingly see as the model by which customers want to consume infrastructure: less as siloed compute/storage/networks and more as coordinated infrastructure driven by the need to deliver workloads.

    Thanks very much for the article and the insights!

    Cheers,
    Mat Mathews
    Co-founder & VP, Product Management

  3. I think that what both David and Mat are saying is that converging the network is not really about hardware but is all about software.
    Storage and compute could be converged into a single hardware device and scaled by adding more of these devices. The network remains separate as it must link each of these devices together.
    To me, the biggest payoff of hyperconverged is ease of management. I agree that SDN holds the promise of simplified, policy-based management. I simply haven’t seen an SDN solution that is suitable for a typical hyperconverged customer to deploy. You cannot even download VMware’s NSX without first attending a five-day training course. I expect that in a few years there will be SDN solutions that can be deployed by an IT generalist without weeks of training. Then SDN will have the simplicity that hyperconverged customers expect.
