Agent and Agent-less Backup in the Virtual Environment

Backup vendors disagree on what defines an agent. Some consider any amount of scripting inside the guest to be an agent, while others reserve the term for the component that performs the data transfer, plus whatever scripting it needs. Is there a need for both agent and agent-less backup within a virtual environment? That raises a further question: who is responsible for properly handling the application whose data you are backing up?
Modern virtual environment backup tools complete a backup in a few steps:

Virtual Environment Backup Mechanism
  • Backup software tells the hypervisor that a snapshot is needed (A)
  • Hypervisor tells the virtual machine to quiesce its disk
  • Hypervisor takes the snapshot (B)
  • Backup happens
  • Backup software tells the hypervisor that the backup is finished (C)
  • Hypervisor commits the snapshot (D)
  • Hypervisor tells the virtual machine that the snapshot process has ended
  • Backup complete
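
The sequence above can be sketched in a few lines of Python. This is only an illustration of the ordering, not any vendor's API: `FakeHypervisor`, `backup_vm`, and the `guest_reachable` flag are hypothetical names, and the "hypervisor" here does nothing but record events.

```python
from dataclasses import dataclass, field


@dataclass
class FakeHypervisor:
    """Hypothetical stand-in for a hypervisor API; it only records events."""
    guest_reachable: bool = True          # assumption: can the guest be notified?
    events: list = field(default_factory=list)

    def create_snapshot(self, vm: str) -> str:
        self.events.append("snapshot requested (A)")
        if self.guest_reachable:
            # Only a reachable guest gets the chance to quiesce its disk.
            self.events.append("guest told to quiesce")
        self.events.append("snapshot taken (B)")
        return f"{vm}-snap"

    def commit_snapshot(self, vm: str, snapshot: str) -> None:
        self.events.append("backup finished (C)")
        self.events.append("snapshot committed (D)")
        if self.guest_reachable:
            self.events.append("guest told snapshot ended")


def backup_vm(hv: FakeHypervisor, vm: str, copy_data) -> list:
    """Run the backup steps in order; always commit the snapshot afterwards."""
    snapshot = hv.create_snapshot(vm)
    try:
        copy_data(snapshot)               # the actual data transfer
        hv.events.append("backup happens")
    finally:
        hv.commit_snapshot(vm, snapshot)  # never leave a snapshot dangling
    hv.events.append("backup complete")
    return hv.events
```

Calling `backup_vm(FakeHypervisor(), "vm1", print)` walks through every step in the list. With `guest_reachable=False` the two guest notifications never appear in the event log, which is the case where the guest gets no chance to prepare.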

Two of these steps communicate with the virtual machine, so that the virtual machine can properly handle the application.  For Microsoft Windows this may mean just using VSS, which is aware of Microsoft products and therefore produces backups of them that are application consistent rather than merely crash consistent, while non-Microsoft applications could still end up crash consistent.
What does crash consistent mean? It means that when the data is restored, some of it may be missing, because it had not yet been written from memory to disk when the snapshot was taken. In the case of a database, this could be a large amount of data, and crucial data could be lost. In essence, the backup could be nothing more than Swiss cheese with lots of holes in it. If this is a full disk backup, there is also a chance that the VM may not boot.
The goal of any backup tool should be to eliminate such issues, yet a tool cannot do so if it never tells the application that a backup is about to occur. Does the backup software you use integrate with your applications in any way to ensure that all data within the application caches is written to disk?
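
A toy model may make the difference concrete. The sketch below assumes a made-up application, `CachedApp`, that holds writes in memory until it is told to flush. The class and the `snapshot` function are hypothetical and exist only to show what a disk-level snapshot can and cannot see.

```python
class CachedApp:
    """Toy application that buffers writes in memory before they reach disk."""

    def __init__(self):
        self.disk = []    # what a disk-level snapshot can see
        self.cache = []   # writes still held in memory

    def write(self, record):
        self.cache.append(record)

    def flush(self):
        # Move everything held in memory onto "disk".
        self.disk.extend(self.cache)
        self.cache.clear()


def snapshot(app, quiesce):
    """Copy the disk contents, optionally asking the app to flush first."""
    if quiesce:
        app.flush()
    return list(app.disk)


app = CachedApp()
app.write("order-1")
app.flush()
app.write("order-2")                                 # still only in memory

crash_consistent = snapshot(app, quiesce=False)      # ["order-1"]
app_consistent = snapshot(app, quiesce=True)         # ["order-1", "order-2"]
```

The unquiesced snapshot is missing `order-2`, even though the application had already accepted that write.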
On VMware vSphere, the hypervisor only tells the virtual machine that a snapshot will be taken or has been committed if VMware Tools is installed and up to date. If it is not, these actions are never communicated to the VM and you end up with a crash consistent backup, which should be avoided at all costs. So there needs to be some way for the backup tool to communicate with the applications within the VM, to quiesce the application and sync data from memory to disk, so that this is complete before the snapshot is taken. For VMware vSphere, the VIX API could be used to communicate through the hypervisor and execute commands within each VM, given the proper authentication within the VM. For other hypervisors, other techniques are required, such as agents, WMI, remote command execution (ssh), etc.
Nor can backup vendors ignore Linux in favor of Windows or vice versa; a growing number of mission critical Linux machines are being virtualized.  If you are backing up mission critical applications within the virtual environment, application quiescing is very important. Many backup vendors depend on the hypervisor to perform this communication and on the guest operating system to do the proper thing. However, if that communication is not guaranteed to happen, or the proper thing is not good enough or does not exist (there is no VSS on Linux machines, for example), then the backup will remain crash consistent unless your administrators can write a script to quiesce the application.
This is where tools such as Symantec NetBackup and other legacy backup tools work best: they already know that you need to tell the application to quiesce, and then back up only the data for the application. These tools also contain agents that are, in some cases, virtual environment aware.
Agent-less backup is great for filesystems, but not necessarily for applications. As such, it may be time for the virtual environment only backup vendors to consider application agents, not just for restore but also for backup, on all operating systems. Depending on the hypervisor to do the work is a catch-22, as the hypervisor vendors support only their own subset of applications and operating systems, not all of the applications and operating systems in use within the virtual environment.
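
As a rough idea of what such an administrator-written script might wrap, here is a minimal Python sketch. It assumes the application offers some command to freeze and some command to thaw. The names `quiesced` and `command_hook` are hypothetical, and the actual commands depend entirely on the application and operating system in question.

```python
import subprocess
from contextlib import contextmanager


def command_hook(argv):
    """Return a callable that runs an administrator-supplied command.

    `argv` is whatever freeze or thaw command the application provides;
    nothing here is specific to a particular product.
    """
    def run():
        subprocess.run(argv, check=True)  # raise if the command fails
    return run


@contextmanager
def quiesced(freeze, thaw):
    """Run `freeze` before the body and always run `thaw` afterwards."""
    freeze()
    try:
        yield
    finally:
        thaw()                            # thaw even if the snapshot fails


# Demonstration with stand-in hooks that only record what happened.
log = []
with quiesced(lambda: log.append("freeze"), lambda: log.append("thaw")):
    log.append("snapshot")
# log is now ["freeze", "snapshot", "thaw"]
```

The one design point worth noting is the `finally` clause: an application left frozen because the snapshot step failed is arguably worse than a crash consistent backup, so the thaw hook runs no matter what happens in between.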