Since publishing VMware vSphere and Virtual Infrastructure Security: Securing the Virtual Environment, I have continued to consider aspects of Digital Forensics and how current methodologies would be impacted by the cloud. My use case for this is 40,000 VMs on 512 servers with roughly 1000 tenants: what I would consider a medium-sized, fully functioning cloud built upon virtualization technology, where the environment is agile through the use of storage and VM vMotion or Live Migration. The cloud would furthermore contain roughly 64TB of disk across multiple storage technologies and 48TB of memory. If you do not believe environments like this exist today, this was the size of the datacenter servicing VMworld 2009. This monster was on display just as you came down the escalators from the main entrance into the keynote sessions.
Now there are several issues with Digital Forensics within the cloud and therefore with any virtual environment. They are:
- The acquisition of data within the cloud
- How to collect the data
- What data to collect
- How to deal with data that is agile and may not be at rest in any one location
- How to respect the privacy of other tenants not under investigation
Whilst at the Cloud Security Summit that took place during RSA Conference, I asked about Forensics within the cloud, and the answer was interesting. In effect, if you are willing to pay for Forensics, the cloud providers will most likely set something up for you. One such cloud provider, Terremark, is already working with the FBI to develop a system using off-the-shelf forensics tools from NetWitness and Guidance Software. However, and this is the crux of the matter, for Forensics to be available these tools must be in place first, and there may be a requirement for agents within your virtual machines.
How to Collect the Data
The state of the art for forensically sound collection of data is to make a bit-by-bit duplicate of a disk using a disk duplicator or network disk duplication tools from Guidance Software and other vendors (which requires an agent within the VM). If you want memory images, this requires the even more time-consuming task of possibly freezing memory before removing power from the host to be duplicated.
However, these mechanisms are not necessary within the virtual environment, where you can grab disk and memory images quite easily via snapshots and other administrative functionality. That said, this method of acquisition has yet to be proven forensically sound.
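One practice that carries over from physical acquisition is checking that a copy matches its source by comparing cryptographic hashes. The following is a minimal sketch in Python, assuming the snapshot has already been exported to an ordinary file. The function names and file paths are hypothetical, and a matching hash shows only that the copy is faithful to the export, not that the snapshot process itself is forensically sound.

```python
import hashlib


def image_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as image:
        # Read in chunks so multi-gigabyte images never sit fully in memory.
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(source_path, copy_path):
    """True when the two files have identical SHA-256 digests."""
    return image_digest(source_path) == image_digest(copy_path)
```

Recording the digest at the time of acquisition and again before analysis gives the investigator a simple way to show the image was not altered in between.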
How to Deal with Agile Data
USB sticks are currently considered the most agile units for forensics. Within the cloud, data can live on multiple hosts via Live Migration, vMotion, High Availability, and Fault Tolerance technologies, or on different storage devices via Storage vMotion.
Traditional mechanisms imply that the acquisition of agile data either be limited or be all inclusive, which could lead to privacy and other issues, such as how to store the TBs of acquired disks. There needs to be a new mechanism to determine whether such historical data is required, or a way to easily determine where a virtual machine has lived within the virtual environment.
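To illustrate what "where a virtual machine has lived" could look like in practice, here is a rough sketch that rebuilds a placement history from migration records. It assumes the platform logs each migration with a VM name, a timestamp, and source and destination hosts. The MigrationEvent record and the placement_history function are hypothetical and are not taken from any vendor's API.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class MigrationEvent:
    """Hypothetical record of one migration of a VM between hosts."""
    vm: str
    when: datetime
    source: str
    destination: str


def placement_history(events, vm, initial_host):
    """Return [(host, start, end)] for one VM, oldest first.

    start is None for the first host (time of arrival unknown) and
    end is None for the host the VM currently lives on.
    """
    moves = sorted((e for e in events if e.vm == vm), key=lambda e: e.when)
    history = []
    host, start = initial_host, None
    for move in moves:
        # Close the interval on the host the VM is leaving.
        history.append((host, start, move.when))
        host, start = move.destination, move.when
    history.append((host, start, None))
    return history
```

With a history like this, an investigator could limit acquisition to the hosts a VM actually occupied during the period in question rather than imaging everything.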
Respect the Privacy of Other Tenants
The next item of note is that it is becoming increasingly clear that respecting the privacy of cloud tenants who are not under investigation is extremely important. A Digital Forensic Scientist cannot currently go on a fishing trip; they are asked very specific questions and can only answer those questions. Even so, it is very important that they only have access to the data under investigation and not data from other tenants.
Current forensic technologies do not consider or understand the concept of multi-tenant systems or virtual and cloud environments. They expect the one tenant, one physical host construct, which means that if there are multiple tenants on a given host, all of them are possibly in scope for the acquisition of data.
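The difference between host-level and tenant-level scope can be shown with a small sketch. It assumes an inventory that maps each VM to its host and tenant. The inventory layout and the acquisition_scope function are hypothetical, meant only to show the kind of filtering a tenant-aware tool would need to do.

```python
def acquisition_scope(inventory, tenant, hosts):
    """Split the VMs on the given hosts into in-scope and excluded names.

    inventory is a list of dicts with "name", "host", and "tenant" keys.
    A host-based tool would acquire both lists; a tenant-aware tool
    would acquire only the first.
    """
    in_scope, excluded = [], []
    for vm in inventory:
        if vm["host"] not in hosts:
            continue  # VM is not on a host of interest
        if vm["tenant"] == tenant:
            in_scope.append(vm["name"])
        else:
            excluded.append(vm["name"])
    return in_scope, excluded
```

The excluded list is the privacy exposure: every VM in it belongs to a tenant who is not under investigation but shares hardware with one who is.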
What Can We Do
Firstly, you can plan for Digital Forensics from the start and ensure your cloud provider has the necessary tools in place and active for current state-of-the-art practice. However, as with all “improvements”, this will often increase the cost to the end user. Alternatively, the forensic investigation can be carried out without such a setup. Depending on how the technologies are integrated, however, this can lead to the suspect being warned that a forensic investigation is being performed.
In the ideal world, a Forensic investigator needs their tools to work at all times. With the best will and intentions this can usually be planned, but investigators are often working on systems that do not have the necessary tools to support the original ‘plan’, so a new plan is required. Therefore the state of the art for digital forensics needs to move with the times and account for agile, virtual, and cloud environments.
Very interesting article. But I feel that provenance (Secure Provenance: The Essential of Bread and Butter of Data Forensics in Cloud Computing) will solve the problem of separating the data of multiple tenants. What is your view on using live forensics in clouds?
Hello,
I think using live forensics in the cloud is the only way to do forensics in the cloud. The issue I have with it is that modern forensics tools limit the amount of data they can gather to only what they can use within the physical environment. That is too limited for the virtual environment needed for the cloud. If no hypervisor is involved, then what is available is what you have to use.
As for provenance solving separation of data, I think this is not the case. You need real separation of data, encryption of data, and verification of data: a dialable level of security. Just relying on provenance is not enough. We need real tools and configurations.
Best regards,
Edward Haletky