VM Escape Is NOT Your Main Worry

Too many times, virtualization and cloud security folks hear that VM Escape is the main worry of security teams. This is far harder to do than most people realize, and requires the attacker to bust through multiple layers of defense in depth! If security teams are worried about VM Escape, then they really do not trust their own defense in depth. They may not even be able to articulate their defense in depth. They may even be confusing VM Escape with Admin Escape. They may just be using this to produce FUD so that they can say no to change. Well, the latter never works. We need to get over this obsession with VM Escape.

First, let us look at hypervisors. I know many of you know this already, but there is a distinct difference between VirtualBox and Xen or KVM, as well as between VMware Workstation or Fusion and VMware vSphere. The major difference is that VirtualBox, Workstation, and Fusion run within another operating system, and Xen, KVM, vSphere, and Hyper-V do not. Actually, the latter run within their own kernels specifically designed to eliminate VM Escapes wherever possible. Do not consider these entities the same. Do some research. If you hear about an escape, find out how it was done and on what platform. Is that a platform you use?
Second, how does one do a VM Escape? There is no magic. One VM cannot talk to another over anything but standard network protocols. A VM Escape implies one has not just jumped between VMs but has written and executed arbitrary code deep within the CPUs controlled by more than one VM. This implies you need to manipulate the hypervisor directly. There is the perception of VM Escape, and the reality.

VM Escape Perception

VM Escape Reality

The perception is that you can just hop from one VM to another without doing anything too strenuous. The reality of a VM Escape is very different from the perception. The following steps need to happen before an escape occurs:

1. One must hack the VM in some fashion.
2. One must hack administrator on the VM in some fashion without crashing the VM.
3. One must hack from the VM to the virtual machine manager (VMM), which is often a large sandbox for running the VM. Once in the VMM, one needs to hook onto something that is unknown (there is no shell here). It is in effect a no-man’s land. This is where one would normally try to execute arbitrary instructions that do not crash the VM.
4. One must hack from the VMM to or through the hypervisor.
5. One must hack the CPU or hack through the CPU. In order to jump between VMs, one often has to jump between CPUs.
6. To jump between CPUs, one must hack the hardware system manager, which protects itself. One wrong move here, and one’s hardware will fry itself via the thermal protections in place.
7. One must then somehow hack the target VM once more to run arbitrary code on the target VM.

It is far easier for an attacker to perform steps 1 and 2 above than to perform steps 3 through 7. As such, the counter to a VM Escape is to properly control access to virtual machines, management tools, and administrative users, regardless of where they live: in the cloud, in the data center, or even within a SaaS. Defense in depth is the only defense you need worry about when thinking about VM Escape.
In fact, an Admin Escape is far easier than a VM Escape. An Admin Escape is when the administrative user of the VMs or the management tools has been breached. If you breach the administrative users of the cloud management tools, then a VM Escape is not needed. You have effective access to everything: virtual disks, networks, even memory in some cases. This is why the lowest hanging fruit is to protect the admin. An Admin Escape takes one step:

One must hack the administrator of the management component of the environment. In vSphere, that would be vCenter, in Xen it would be XenCenter, in KVM it would be libvirt or oVirt, and for Hyper-V it would be System Center.

However, many companies allow anyone to access these management tools. They allow any network to access those tools, and they allow any device. This is the problem, not VM Escape. To perform a VM Escape, I need admin access to the VM. In many cases, admin access to a VM gives you the credentials you need to gain admin access to the management tools. How do you ensure that the proper people have that access, that they are not good electronic copies of administrators? How do you control admins? Monitor them? Use different passwords, usernames, etc. for the different types of admins?
A host of defense in depth is required. It all starts with some form of Management Access Security Broker (MASB), which is a management-specific form of a Cloud Access Security Broker (CASB). A MASB tied to good authentication and authorization practices is the core to a good security posture for your entire hybrid cloud. One MASB is HyTrust, for vSphere environments. It is worth checking out. Controlling the admin limits any attempt at a VM Escape. Check out vmescape.com for a growing list of resources to read about why this is just not a big deal compared to Admin Escape.
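To make the idea of brokering management access concrete, here is a minimal sketch in Python of the kind of check that could sit in front of a management tool: is this a known administrator, coming from a management network, on a managed device? This is not how HyTrust or any other MASB actually works. The network range, user names, device names, and the `AccessRequest` and `is_allowed` names are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# Hypothetical policy values, for illustration only.
MGMT_NETWORKS = [ip_network("10.10.50.0/24")]
ADMIN_USERS = {"vadmin-alice", "vadmin-bob"}
MANAGED_DEVICES = {"jump-host-01"}


@dataclass(frozen=True)
class AccessRequest:
    user: str       # account asking for access to the management tool
    source_ip: str  # address the request comes from
    device: str     # device identifier presented with the request


def is_allowed(req: AccessRequest) -> bool:
    """Allow only known admins, from a management network, on a managed device."""
    addr = ip_address(req.source_ip)
    from_mgmt_network = any(addr in net for net in MGMT_NETWORKS)
    return (
        req.user in ADMIN_USERS
        and from_mgmt_network
        and req.device in MANAGED_DEVICES
    )
```

All three conditions must hold. A valid admin account arriving from an arbitrary network or an unknown device is refused, which is the opposite of allowing any network and any device.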
Remember, to escape a VM in reality, one must first hack the admin (whether for the VM or for the management of a cluster of virtualization or cloud hosts). If someone is asking about VM Escape first, or even in their top five questions, they are really asking the wrong question, which shows they have fallen prey to the FUD.

2 replies on “VM Escape Is NOT Your Main Worry”

Red says:

January 7, 2017 at 3:35 am

I’m with you when you are trying to remove some of the ignorance around the subject of security in virtual environments. Unfortunately you resorted to claims of bad faith which I think lost some of the power of what you were saying.

They may just be using this to produce FUD so that they can say no to change.

To address your general point, I agree that it is not your main worry. As an attacker, let me count the easier ways for me to cross security boundaries between your VMs. Compliance mandated SSO, credential re-use, application trust relationships… all equally apply to physical hosts.
I do however disagree with two of your blanket assertions. That worrying about crossing security boundaries in a virtual environment is:
a) Extremely Difficult.
b) Shows a lack of trust of defense in depth.
You’re conflating complexity with difficulty. They’re not the same. Finding ways to wag the tail on one dog to make another bark as you say is very complex. However, once that complexity is wrapped into a metasploit module it’s no longer difficult.
Virtualizing environments increases attack surface, that’s physics.
Are we going to see more VM escape exploits in the future? Undoubtedly. For example, a week before you posted this article Ben Gras, Kaveh Razavi, Brainsmoke and Antonio Barresi demonstrated several rather smart side-channel attacks which abuse memory de-duplication between VMs to do cross-vm data leakage as well as cross-vm data writing attacks. (https://www.kb.cert.org/vuls/id/935424/)
Bottom line.
Should you avoid using virtualization because of some (almost certain) future bug? In 99.9% of cases, no you shouldn’t.
There are HUGE security advantages to VMs, especially around instrumentation and Incident Response which more than outweigh the additional risk that being in a VM produces.
There is no panacea… there’s only technology and it is our job as engineers to help the business choose the right technology for each part of their business.
1. Edward Haletky says:
  
  January 7, 2017 at 12:05 pm
  
  Hello Red,
  This attack came out in 2015 and was never proven to affect all hypervisors. So once more, we look at the hypervisor in use. We also have a mitigation strategy that provides some level of defense in depth. This is just one link in the chain here. You need to take this attack and add several others to complete the escape. Getting memory addresses is a crucial second step here, but what do you do with them once you have them? That is what is missing in this exploit. You need more than one exploit to get a true VM Escape, and that is not always easy to do. Even when scripted, those scripts have never been 100% reliable. Success depends on a set of variables outside the scope of those scripts (even within Metasploit).
  In order for the aforementioned attack to succeed, you need to first allow memory deduplication to occur (in vSphere terms, enable Transparent Page Sharing). For years now, all hypervisors have had the ability to disable this feature. So for this attack to work, you would need to (a) break into a VM, (b) determine if memory deduplication is enabled, (c) attempt the attack, and (d) use the base address in another attack. Or combine (b) and (c), see if you get anything useful, and then attempt (d).
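As one concrete way to check step (b) from the defender's side: on a Linux KVM host, memory deduplication is provided by Kernel Samepage Merging (KSM), and its state is exposed in `/sys/kernel/mm/ksm/run`. The sketch below only reads that file. The function names are illustrative, and it covers Linux hosts only; vSphere Transparent Page Sharing is configured through vSphere's own settings.

```python
from pathlib import Path

# Standard sysfs location of the KSM control file on Linux.
KSM_RUN = Path("/sys/kernel/mm/ksm/run")


def parse_ksm_state(raw: str) -> str:
    """Translate the contents of the KSM 'run' file into a readable state."""
    value = raw.strip()
    states = {
        "0": "stopped",                # ksmd not running, merged pages kept
        "1": "running",                # ksmd actively merging pages
        "2": "stopped and unmerging",  # ksmd stopped, merged pages split again
    }
    return states.get(value, f"unknown ({value})")


def ksm_state(path: Path = KSM_RUN) -> str:
    """Report whether KSM is active on this host."""
    if not path.exists():
        return "KSM not available on this host"
    return parse_ksm_state(path.read_text())
```

Calling `print(ksm_state())` on a host reports its current state. A result of "running" means pages can be merged across VMs on that host, so that is where the decision to leave deduplication on or off applies.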
  So if we can prevent (a) and mitigate against (b), then even if (a) succeeds, (c) cannot, which implies (d) does not either. We therefore have defense in depth. Does this limit the effectiveness of Virtual Machines within any hypervisor or cloud? No. There is no magic happening here. Does this imply I should always disable memory deduplication? Perhaps. I do in most cases (always within a DMZ), and do not in others. It depends on your other defense-in-depth controls.
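The chain reasoning above can be written down as a small Python sketch: the attacker must complete steps (a) through (d) in order, and any blocked step stops everything after it. The step labels paraphrase the reply; the `CHAIN`, `furthest_step`, and `escape_possible` names are made up for this illustration, and the model deliberately ignores how likely each step is.

```python
from typing import List, Set

# Steps (a)-(d), in the order an attacker must complete them.
CHAIN = [
    "a: break into a VM",
    "b: find memory deduplication enabled",
    "c: run the deduplication attack",
    "d: use the leaked base address in another attack",
]


def furthest_step(blocked: Set[str]) -> List[str]:
    """Return the steps the attacker completes before hitting a blocked one."""
    completed = []
    for step in CHAIN:
        key = step.split(":")[0]  # "a", "b", "c", or "d"
        if key in blocked:
            break  # a blocked link ends the chain here
        completed.append(step)
    return completed


def escape_possible(blocked: Set[str]) -> bool:
    """The attack only works if every link in the chain is completed."""
    return len(furthest_step(blocked)) == len(CHAIN)
```

With deduplication disabled, `furthest_step({"b"})` returns only step (a): the VM was breached, but (c) and (d) never happen. Blocking (a) as well returns an empty list, which is the case where identity and hardening controls held.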
  In nearly every possible VM Escape, there is some way to mitigate against the attack that is a part of any defense in depth. It starts with identity and moves down to hardening quite quickly. That means hardening the application and the VM so that arbitrary code cannot run (such as the code necessary to do this part of an escape), and hardening the hypervisor to mitigate potential issues. It also implies having other well-known controls in place to mitigate against well-known attacks that allow access to the application or the VM.
  Once more, if you are worried first about VM Escape, then you do not trust your existing defense in depth. Defense in depth does not stay static; it changes all the time. Keeping your defense in depth up to date is an ongoing effort that does not stop once the initial hardening and controls are laid down. It also requires continual research and reading of all fine print.
  All the hypervisor vendors are updating their hardening guides to account for anything they discover. This is a continual update as well. Granted, they do not always publish continually, yet the updates happen all the time.
  In your example, none of the major cloud players were actually affected by this attack. It produced FUD when people did not read the fine print, and we all know how often that happens.
  Best regards,
  Edward L. Haletky

Comments are closed.