vSphere 4.1 Released – More Dynamic Resource Load Balancing

With the release of vSphere 4.1, VMware has added an impressive set of new features. These generally fall in the areas of improved manageability, improved performance, and improved scalability. In essence, vSphere 4.1 is more than a point release: this update includes many features that aid security and reliability, and it is a direct response to customer requests.
Overview of What is New
vSphere 4.1 contains many enhancements, grouped in the areas of Availability, Security, Scalability, vCompute, vStorage, and vNetwork, as shown in the diagram below.
vSphere 4.1 What Is New
Storage I/O Control
VMware has added to their Dynamic Resource Load Balancing (DRLB) suite of tools that was hinted at in Dynamic Resource Load Balancing. The DRLB features are low enough level to apply to any workload, but they really come into play when there are heavily loaded systems and clusters. Yes, clusters. VMware has spent considerable time implementing storage features that aid entire clusters as well as single hosts. Storage IO may no longer be as large a bottleneck as it once was with vSphere 4.0.
Storage IO Control (SIOC) writes latency data to the header of a VMFS datastore, which is then used to determine whether SIOC should come into play. If there is high latency (20ms or greater), SIOC kicks in. To make use of SIOC, however, you must reconfigure the per-VMDK shares on those VMs that need more storage IO than others. Disk Shares now have two distinct meanings: one governs how a given host handles IO into and out of a VM, and the other governs how IO is handled across a cluster based on latency numbers. That is one control mechanism for both traditional disk IO and SIOC; eventually, I hope there is a way to control each independently. So how do you know if there is latency in your storage subsystem? VMware reports this back within the vSphere Client, or you can use the handy vKernel StorageView tool, which will show you latency for up to 5 datastores at a time.
SIOC
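To make the behavior concrete, here is a minimal sketch, in Python, of the kind of decision logic described above. Only the roughly 20ms trigger and the use of per-VMDK shares come from the feature description; the queue-depth numbers, helper names, and proportional math are my own illustrative assumptions, not VMware's implementation.

```python
# Illustrative sketch of SIOC-style behavior: once datastore latency crosses
# the trigger, divide the device queue among VMs in proportion to their
# per-VMDK disk shares. The values and names here are hypothetical.

LATENCY_TRIGGER_MS = 20        # SIOC engages at roughly 20 ms or greater
TOTAL_QUEUE_SLOTS = 64         # assumed per-host device queue depth

def rebalance_datastore(vmdks, observed_latency_ms):
    """vmdks: list of (vm_name, shares) tuples sharing one datastore."""
    if observed_latency_ms < LATENCY_TRIGGER_MS:
        # No contention: every VM may use the full queue.
        return {vm: TOTAL_QUEUE_SLOTS for vm, _ in vmdks}
    total_shares = sum(shares for _, shares in vmdks)
    # Contention: throttle each VMDK to a slice proportional to its shares.
    return {vm: max(1, TOTAL_QUEUE_SLOTS * shares // total_shares)
            for vm, shares in vmdks}

# Two "normal" (1000-share) VMs and one "high" (2000-share) VM at 25 ms latency:
print(rebalance_datastore([("vm-a", 1000), ("vm-b", 1000), ("vm-c", 2000)], 25))
```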
vStorage APIs for Array Integration
vStorage API for Array Integration (VAAI) allows storage arrays to handle repetitive storage tasks with minimal transfer over the wire. In other words, a single instruction can write gigabytes of zeros instead of having to waste time sending those zeros down to the array controller. There will be other VAAI constructs coming in the near future that will also speed up various aspects of storage. VAAI will do for storage what Intel VT/AMD RVI did for CPUs: remove repetitive tasks from the data flow, thereby allowing more bandwidth across storage clusters and improving overall storage performance.
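A rough way to picture the difference: without the offload, the host generates and ships every zero block itself, while a VAAI-style primitive sends one small command and lets the array do the work. The sketch below is purely conceptual; `array`, `write`, and `write_same` are hypothetical stand-ins, not a real storage API.

```python
# Conceptual contrast only; 'array' and its methods are hypothetical
# stand-ins for a storage target, not a real API.

BLOCK = 1 << 20                      # 1 MiB transfer size (illustrative)

def zero_without_offload(array, lun, length):
    # Host generates the zeros and pushes every block over the wire.
    zeros = bytes(BLOCK)
    for offset in range(0, length, BLOCK):
        array.write(lun, offset, zeros)

def zero_with_offload(array, lun, length):
    # One small command; the array writes the zeros internally.
    array.write_same(lun, offset=0, length=length, pattern=b"\x00")
```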
Network I/O Control (NetIOC) and Load Based Teaming (LBT)
DRLB has also been improved within networking with two new features that come into play when there is some form of network contention within a Virtual Distributed Switch, and only a Virtual Distributed Switch (vDS). Since a vDS spans an entire cluster of vSphere hosts, DRLB could have a few more tools in the toolbox when discussing clusters. At the moment these tools are host only.

  • Load Based Teaming (LBT) provides a way to control which outbound pNIC a given vNIC uses. When a VM starts up, its vNIC is bound to a pNIC according to the currently set load balancing rules on the vSwitch. In the past, the pNIC in use never changed. LBT will change this binding as needed (no more than every 30 seconds) if there is perceived network contention on an outbound pNIC (see the sketch after this list).
  • Network IO Control (NetIOC) provides traffic shaping on the outbound pNIC per VM. While traffic shaping already exists on entry to a VM and exit from a VM, NetIOC provides traffic shaping for traffic outbound from the host on a per-VM basis. Once more we have similar controls split between one method of traffic shaping and another.
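As a mental model of what LBT does (see the first bullet above), the sketch below re-evaluates vNIC-to-pNIC bindings and moves a vNIC off a congested uplink. Only the 30-second cadence comes from the feature description; the utilization threshold, data structures, and helper names are assumptions made for illustration.

```python
SATURATION = 0.75        # assumed utilization level that counts as contention
MIN_INTERVAL_S = 30      # LBT re-binds no more often than every 30 seconds

def rebalance_team(team, get_utilization, move_vnic):
    """team: {pnic: [vnics]} for one distributed-switch port group.
    get_utilization and move_vnic are hypothetical helpers."""
    busy = [p for p in team if get_utilization(p) >= SATURATION]
    idle = sorted((p for p in team if get_utilization(p) < SATURATION),
                  key=get_utilization)
    for pnic in busy:
        if idle and team[pnic]:
            vnic = team[pnic].pop()          # choose a vNIC to move
            target = idle[0]                 # least-loaded uplink
            move_vnic(vnic, pnic, target)    # re-bind the vNIC
            team[target].append(vnic)

# A driving loop would call rebalance_team at most once every MIN_INTERVAL_S.
```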

LBT and VAAI have immediate use if your array supports VAAI and you make use of Virtual Distributed Switches. NetIOC and SIOC provide the tools to control contention across network and disk resources. They are not yet integrated directly into DRS, which covers CPU and memory resource contention. Could SIOC be used to judge when to automate Storage vMotion? I would say it could, but at the moment you would need to write your own scripts.
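For example, a home-grown watcher could poll datastore latency and kick off a Storage vMotion when a datastore stays hot. This is only a hedged sketch: `read_latency_ms`, `pick_vm_to_move`, and `start_storage_vmotion` are placeholders for whatever SDK or CLI calls you would actually use, and the thresholds are arbitrary.

```python
from collections import deque

HOT_MS = 20      # same latency level SIOC reacts to
SAMPLES = 6      # require this many consecutive hot polls before acting

history = {}     # datastore name -> recent latency samples

def check(datastore, target, read_latency_ms, pick_vm_to_move,
          start_storage_vmotion):
    """Poll one datastore and relocate a VM if latency stays high.
    The three callables and 'target' are hypothetical placeholders."""
    samples = history.setdefault(datastore, deque(maxlen=SAMPLES))
    samples.append(read_latency_ms(datastore))
    if len(samples) == SAMPLES and min(samples) >= HOT_MS:
        vm = pick_vm_to_move(datastore)       # e.g., the busiest VMDK's VM
        start_storage_vmotion(vm, target)     # relocate to a quieter datastore
        samples.clear()
```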
Over-Commit
VMware has set the stage for better utilization of memory with a Memory Compression over-commit option. This fits into the over-commit stack of tools before swapping to disk:

  • Content Based Page Sharing (Transparent Page Sharing – TPS)
  • Memory Balloon
  • Memory Compression
  • Swap to Disk

Memory compression is 10,000 times faster than swapping to disk, but still much slower than the other two. While TPS is an idle-time collapsing of 4K memory pages, memory compression is a real-time compression of memory pages, most likely using something like gzip functionality. Which algorithm is used is really unimportant, as the overall memory savings allow you to run more VMs without actually swapping to disk. Once you swap to disk, performance really tanks.
Memory Compression
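To see why a gzip-style approach pays off, the toy example below compresses a fairly repetitive 4K page with Python's standard zlib (a DEFLATE implementation in the same family as gzip) and reports the space saved. It only illustrates the compression idea; ESXi's actual algorithm and page handling are not shown here.

```python
import os
import zlib

PAGE = 4096

# Build a 4K "guest page": mostly zeros with a little real data,
# which is typical of pages that compress well.
page = bytearray(PAGE)
page[:256] = os.urandom(256)

compressed = zlib.compress(bytes(page))
print(f"original: {PAGE} bytes, compressed: {len(compressed)} bytes "
      f"({100 * len(compressed) // PAGE}% of original)")

# A hypervisor would keep the compressed copy in a RAM cache only when the
# savings are worthwhile, and fall back to swapping the page to disk otherwise.
```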
The implementation of Memory Compression also paves the way for other memory-related tools, as I outlined in Safe way to Encrypt within a VM – Need for Technology. While memory encryption would not provide over-commit capability, it would enhance Secure Multi-Tenancy.
Security
VMware has also enhanced security within ESXi as well as ESX by pulling the Likewise (http://www.likewise.com) Open Active Directory integration functionality into the VMware hosts. This provides full AD integration, including the use of GPOs, etc. Prior to v4.1 we had either half-measures or complex-to-implement full measures. Now there is an easy way to join the management appliance/service console to an AD domain for unified groups, users, and policies. Now if they would also store the Roles within AD, there could be a unified Role Based Access Control that spans vCenter to ESX and ESXi.
ESXi has been enhanced to allow SSH to be enabled and disabled for a set time limit. This functionality will greatly improve overall support capabilities. There is no longer a need to go into ‘tech support’ mode to control whether SSH is enabled or not; you can do this from the vSphere Client now. Actually, ‘tech support’ is no longer the name of that mode. It still exists, but the nasty messages have been removed. It is a valid and usable mode for accessing the console directly as needed.
ESXi Access Controls
vMotion
EVC and vMotion have been improved overall as well. You can now perform more simultaneous vMotions, and there is a new EVC mode for the AMD chips that do not contain 3DNow! support.
Installation
VMware has also enhanced the installation of ESXi to include the ability to script the install using a kickstart file similar in nature to the one used by ESX. This will improve the ability to rapidly and consistently deploy ESXi.
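As a rough illustration, the sketch below generates a minimal kickstart file of the sort such a scripted install consumes. The directive names follow the classic ESX kickstart syntax and are shown only as an example, not as the authoritative ESXi 4.1 directive set; the password, URL, and file path are placeholders.

```python
# Illustrative only: emit a minimal ESX-style kickstart file. The directives
# below follow classic ESX kickstart syntax and are examples, not the
# authoritative ESXi 4.1 directive set.
KS_TEMPLATE = """\
accepteula
rootpw {root_password}
autopart --firstdisk --overwritevmfs
install url {install_url}
network --bootproto=dhcp --device=vmnic0
reboot
"""

def write_ks(path, root_password, install_url):
    with open(path, "w") as handle:
        handle.write(KS_TEMPLATE.format(root_password=root_password,
                                        install_url=install_url))

# Placeholder values; point install_url at your own deployment server.
write_ks("ks.cfg", root_password="changeme",
         install_url="http://deploy.example.com/esxi41")
```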
Impact on Virtualizing Business Critical Applications
VMware has dramatically extended the capability of its platform in this release in terms of its ability to support the virtualization of the next frontier of business critical applications. However, VMware has not yet shipped any form of an integrated management suite that takes advantage of these new platform features to provide the kind of infrastructure performance assurance, application performance assurance, and automated operations that will be required to support these applications in production. The question of how the vendors in the VMware ecosystem will continue to add value to the vSphere platform will be explored in detail in a subsequent post. However, the short answer is that the vendors in our Solutions Showcase will have an increasingly important role to play as vSphere 4.1 gets used to virtualize more than tactical applications.
Conclusion
There are many, many enhancements to ESX and ESXi with the release of vSphere 4.1, so before you upgrade I strongly urge you to read the release notes!
DRLB, over-commit, security, vMotion, and installation are the big improvements. These pave the way for other tools to be created. The one security aspect I wished was in v4.1 did not make it: I feel that ESXi’s management appliance should be a VM, or that VMware should utilize the Determina purchase to secure the ESXi management appliance.
VMware is moving the hypervisor into the future; it is no longer just about a single host, but about the cluster of hosts.