Rethinking vNetwork Security

The most recent Virtualization Security Podcast was on Rethinking vNetwork Security and featured Brad Hedlund of Cisco as the guest. The conversation started on Twitter two weeks ago and led to Brad posting a write-up of his discoveries on his InternetworkExpert blog titled The vSwitch Illusion and DMZ Virtualization. The podcast went even deeper into the technology and may have come up with a solution to what is becoming a sticky vNetwork problem for some organizations' security policies. However, what it all boils down to is Trust. Where do we place our Trust? Without full knowledge of how the vNetwork stack operates, that Trust could be misplaced.
Brad asked the question: should the physical network security policy be different from the virtual network security policy? The answer is obviously no, but why are they treated separately? I and others have pushed the concept that, to gain performance, redundancy, and security, you should use multiple network links to your virtualization host to separate traffic. However, does this really give you security?
This is where Trust comes into play. However, Trust cannot exist without full knowledge, so we need to look at things in a bit more detail to determine the full extent of our rethink. Brad Hedlund's vSwitch Illusion post is a great starting point: it surmises that the vSwitches within a VMware ESX host are nothing more than a single control plane with multiple forwarding structures, one per vSwitch, and we can further surmise that there is a second tier of forwarding structures, one per portgroup, within each vSwitch.
Background
Since this represents a single control plane and multiple forwarding structures, it really looks like multiple vSwitches are nothing more than VLANs, though not the normal ones we see within a pSwitch. Below the control plane of the vSwitch is the actual vmkernel network stack, which itself contains common code shared by all network drivers before we come to the actual drivers for the physical adapters. If all the adapters are the same make and model, then only one driver is in use. If multiple makes and models of adapters are in use, then different drivers are possibly also in use; this depends on how many devices one driver supports. For example, many of the Intel drivers, such as the e1000 driver, work with many different makes and models of Intel Gigabit adapters. Figure 1 shows this stack from the VM down to the driver. In this example, we have two distinct drivers in use.

VMware vNetwork Stack
Figure 1
Figure 1 shows that vSwitch1 and vSwitch2 are 'separate' in some way; these blocks actually represent the separate forwarding structures used within the vSwitch control plane, with vPG4 and vPG5 representing the second-tier structures used by portgroups. So we asked ourselves in the podcast: given that vSwitch1 and vSwitch2 are separate forwarding data structures, is this really just a VLAN implementation with some built-in protections? The answer we came up with is that, for the VMware vSwitch, this is the case. There is really only ONE VMware vSwitch in use per VMware ESX or ESXi host.
This is also the case for the Cisco Nexus 1000v, but that has always been the case and Cisco has never tried to hide this fact. However, regardless of the vSwitch control plane in use, VMware vSwitch or Cisco Nexus 1000v, there has always been one Network Kernel Stack in use and, below that, generally separate drivers, as we discussed earlier. So even though we have networks split by VLANs, the kernel eventually uses the SAME code, and the data is no longer separate but segregated by data structures.
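To make that "single control plane" idea concrete, here is a minimal conceptual sketch in Python. It is not VMware's code; the class names, portgroups, and driver names are purely illustrative. It simply models one control plane per host holding a forwarding structure per vSwitch construct, a second-tier structure per portgroup, and one shared kernel network stack with its drivers underneath everything.

```python
# Conceptual sketch only -- not VMware code. It models one vSwitch control
# plane per host that holds a separate forwarding structure per "vSwitch"
# and, within each, a second-tier structure per portgroup, while everything
# ultimately sits on the same kernel network stack and drivers.

class PortGroup:                        # second-tier forwarding structure
    def __init__(self, name, vlan=0):
        self.name, self.vlan = name, vlan

class VSwitchConstruct:                 # first-tier structure ("vSwitch1", "vSwitch2", ...)
    def __init__(self, name):
        self.name = name
        self.portgroups = {}            # one entry per portgroup

    def add_portgroup(self, name, vlan=0):
        self.portgroups[name] = PortGroup(name, vlan)

class KernelNetworkStack:               # common code below every vSwitch
    def __init__(self, drivers):
        self.drivers = drivers          # e.g. {"vmnic0": "e1000", "vmnic2": "bnx2"}

class ControlPlane:                     # there is effectively ONE of these per host
    def __init__(self, kernel_stack):
        self.constructs = {}            # "separate" vSwitches are just entries here
        self.kernel_stack = kernel_stack

    def add_vswitch(self, name):
        self.constructs[name] = VSwitchConstruct(name)
        return self.constructs[name]

stack = KernelNetworkStack({"vmnic0": "e1000", "vmnic2": "bnx2"})
cp = ControlPlane(stack)
cp.add_vswitch("vSwitch1").add_portgroup("Production", vlan=10)
cp.add_vswitch("vSwitch2").add_portgroup("DMZ", vlan=20)
# Two "vSwitches" exist, yet both live inside the one ControlPlane and share
# the one KernelNetworkStack instance below them.
```

The point of the sketch is simply that the "separation" between vSwitch1 and vSwitch2 is a data-structure boundary, not a physical one.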
VMware vNetwork with VMware vSwitch and Cisco Nexus 1000v
Figure 2
Now let us look at Figure 2 to the left and see how the network stack shapes up when you use a VMware vSwitch and a Cisco Nexus 1000v within the mix (this is also the case if we look at the VMware Distributed Switch). In this case there are two vSwitch control planes, one for VMware and one for the Cisco Nexus 1000v; however, the Network Kernel Stack remains the same, as do the drivers in use.
What this means is that virtual networking provides a logical separation, or segregation, of packets into what is commonly referred to in the physical space as VLANs. At some point it is possible that the network packets from multiple VMs will commingle, whether within the vSwitch control plane, within the network kernel stack, or even within the drivers in use.
Other Physical vs Virtual Differences
There are several other issues we need to discuss with respect to virtual vs. physical networks, the most important being the security layers that exist within the two environments. When you have more than one virtual switch, as we do in Figure 1, we should be aware that the vSwitch Control Plane will not allow interaction between those two virtual switches except where the traffic is bridged, either within the physical layer or within the virtual layer via a virtual machine. Unlike portgroups, which can talk to each other on the same vSwitch construct (but not across constructs), the use of multiple virtual switches prevents this basic communication. The analog within the physical environment is the use of multiple physical switches. This is a logical and programmatic security aspect of using multiple VMware vSwitches and our first security difference. Even though there may be one control plane for all vSwitches, the control plane prevents data on one vSwitch construct from reaching the other vSwitch constructs.
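That isolation rule can be sketched as a small decision function. This is a conceptual illustration of the behavior described above, not the actual ESX forwarding code, and the topology used is hypothetical.

```python
# Conceptual sketch of the programmatic isolation described above -- not real
# ESX code. The control plane forwards a frame between portgroups only when
# both sit on the same vSwitch construct (and, here, the same VLAN); frames
# addressed across constructs simply have no path inside the host.

def can_forward(topology, src_pg, dst_pg):
    """topology maps vSwitch name -> {portgroup name: vlan}."""
    for vswitch, portgroups in topology.items():
        if src_pg in portgroups and dst_pg in portgroups:
            # same construct: allowed if the VLAN tags line up
            return portgroups[src_pg] == portgroups[dst_pg]
    return False  # different constructs: no forwarding path exists

topology = {
    "vSwitch1": {"Production": 10, "Management": 10},
    "vSwitch2": {"DMZ": 20},
}
print(can_forward(topology, "Production", "Management"))  # True  (same construct)
print(can_forward(topology, "Production", "DMZ"))         # False (separate constructs)
```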
Another difference is that, out of the box, a VMware vSwitch is not susceptible to many of the Layer 2 attacks that exist by default in many physical switch configurations. Granted, you can harden your pSwitch to prevent these attacks, but out of the box, Layer 2 attacks are generally not prevented within the pSwitch fabric as they are in the vSwitch fabric. This is because vSwitches are authoritative about what is actually connected to them.
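To illustrate why being authoritative matters, here is a rough sketch contrasting a classic learning pSwitch, whose MAC table is populated from observed source addresses (the behavior CAM-flooding attacks exploit), with a vSwitch-style table populated only from the hypervisor's own VM configuration. The classes, table size, and MAC addresses are invented for illustration.

```python
# Conceptual sketch, not real switch code: a learning switch trusts whatever
# source MACs it observes, while an "authoritative" vSwitch only ever knows
# the MACs the hypervisor assigned to its VMs.

class LearningSwitch:
    def __init__(self, table_size=4):
        self.table, self.table_size = {}, table_size

    def receive(self, src_mac, port):
        if len(self.table) < self.table_size:
            self.table[src_mac] = port      # learns whatever it sees
        # once the table is full, unknown destinations get flooded to all ports

class AuthoritativeVSwitch:
    def __init__(self, vm_assignments):
        self.table = dict(vm_assignments)   # populated from VM configuration only

    def receive(self, src_mac, port):
        pass                                # forged source MACs teach it nothing

pswitch = LearningSwitch()
vswitch = AuthoritativeVSwitch({"00:50:56:aa:bb:01": "vm1-port"})
for i in range(10):                         # attacker floods bogus source MACs
    pswitch.receive(f"de:ad:be:ef:00:{i:02x}", port=1)
    vswitch.receive(f"de:ad:be:ef:00:{i:02x}", port="vm1-port")

print(len(pswitch.table), len(vswitch.table))  # 4 vs 1: only the pSwitch table was polluted
```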
Layer 3 protections, such as Dynamic ARP Inspection, exist within the Cisco Nexus 1000v and many physical switches, but not within the VMware vSwitch. This is expected, as the VMware vSwitch is a Layer 2 device only, and VMware relies on VMsafe-Net to perform any Layer 3 intrusion prevention or detection.
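For readers unfamiliar with Dynamic ARP Inspection, the sketch below shows the core idea only: an ARP reply is forwarded only if its IP-to-MAC claim matches a trusted binding table (typically built by DHCP snooping). It is not the Nexus 1000v implementation, and the addresses are made up.

```python
# Conceptual sketch of what a protection like Dynamic ARP Inspection does --
# not vendor code. Replies that do not match a trusted IP-to-MAC binding are
# treated as ARP poisoning and dropped.

TRUSTED_BINDINGS = {                 # ip -> mac, assumed to come from DHCP snooping
    "10.0.20.5": "00:50:56:aa:bb:05",
}

def inspect_arp_reply(sender_ip, sender_mac):
    expected = TRUSTED_BINDINGS.get(sender_ip)
    if expected is None or expected != sender_mac:
        return "drop"                # spoofed or unknown binding
    return "forward"

print(inspect_arp_reply("10.0.20.5", "00:50:56:aa:bb:05"))  # forward
print(inspect_arp_reply("10.0.20.5", "de:ad:be:ef:00:01"))  # drop (poisoning attempt)
```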
Where do we place our Trust?
Given these concerns, where do we put our Trust? This depends entirely on the answers to the following questions about your security policy as well as your physical network implementation:

  • Do you accept the risk of VLANs on your current physical network?

If you answered yes, then whether there is one vSwitch control plane or multiple should not make much of a difference, as VLANs are an accepted practice.

  • Do you require physical separation for your DMZ security zone from your production and management security zones?

If you answered yes, then we now have to consider the vSwitch control planes and the built-in, programmatic security of the VMware vSwitch and Cisco Nexus 1000v. Regardless, the traffic is only segregated within the Kernel Network Stack, and perhaps only segregated within the drivers in use. No matter where you look, there will not be a physical separation between traffic within any virtualization host. We can segment the vSwitch control planes as in Figure 2, and use different classes of drivers to segment that traffic, but the Kernel Network Stack is not separated, just segregated.

Given that at least the Kernel Network Stack is shared between vNetwork components on any virtualization host, no matter the vSwitch or pNICs in use, we ultimately have to Trust that this layer is secure. If we use the same make and model of pNICs, then we must also Trust that the driver is secure. If we use only one vSwitch, then we must also Trust that the vSwitch control plane is secure.
If you choose to Trust the vSwitch control layer and use multiple VMware vSwitches, you are trusting that the VMware vSwitch control plane security, namely that no vSwitch construct can talk to another vSwitch construct, is maintained.
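That trust calculus can be restated in a few lines of code. This is only a summary of the argument above with illustrative inputs, not a tool or a complete threat model.

```python
# A sketch that restates the trust argument as code: given a design, which
# layers of the vNetwork stack are still shared between security zones and
# therefore must be trusted? (Names and inputs are illustrative only.)

def shared_layers(vswitch_types, driver_classes):
    layers = ["kernel network stack"]           # always shared on a single host
    if vswitch_types < 2:
        layers.append("vSwitch control plane")  # one control plane carries all zones
    if driver_classes < 2:
        layers.append("pNIC driver")            # one driver touches all traffic
    return layers

# Everything on one VMware vSwitch with identical pNICs:
print(shared_layers(vswitch_types=1, driver_classes=1))
# ['kernel network stack', 'vSwitch control plane', 'pNIC driver']

# VMware vSwitch plus Nexus 1000v with distinct adapter classes (as in Figure 2):
print(shared_layers(vswitch_types=2, driver_classes=2))
# ['kernel network stack']
```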
Where does that leave us?
If you want to minimize any possible bleed-over between two distinct networks on your virtualization hosts, I suggest you look at using multiple vSwitch types (VMware vSwitch, VMware Distributed Switch, and Cisco Nexus 1000v) as well as distinct classes of network adapters for each of your separate networks (à la Figure 2). In this way you have protected your pNIC/driver layer and your vSwitch control plane layer. Even so, the Kernel Network Stack remains shared between those networks and is difficult to protect.
When you consider using VMware ESX or any hypervisor to host a DMZ, it is important to realize that you have no method of true physical separation; eventually you must, at a minimum, trust that the vmkernel network stack layer is doing the proper thing and that its code is secure.
Brad Hedlund’s article is about ensuring that you have one security policy based on logical separation of networks for both the physical and virtual networks. We need to take that one step further and truly understand what is happening under the covers to provide as many compensating controls as necessary. This includes understanding the innards of the hypervisor virtual network. Then we need to make the necessary decisions to plan for redundancy, performance, and security.
