Although virtualization technologies have been a great help to data center managers looking to reclaim power in an overburdened environment, virtualization can also create problems. As IT organizations complete their server virtualization initiatives and their virtualization management skills mature, use of physical server power management features and dynamic workload shifting will increase. This, coupled with increased adoption of server hosted desktop virtualization, will create new opportunities for improving the efficiency of data center power and cooling systems, as well as new challenges in keeping up with increasing demand.
In 2006, volume servers (x86 servers performing duties such as file, print, and authentication services and web, database, and e-mail hosting) were responsible for approximately 85% of all data center server power consumption. [i] In the past three years, somewhere between 15% and 20% of these servers have been virtualized, leading to a significant reduction in power consumption. Expectations are that this trend will accelerate over the next three years until upwards of 90% of volume servers have been virtualized.
Looked at strictly from a data center power and cooling perspective, virtualization’s primary benefit is concentration of workload from a larger number of poorly utilized servers onto a smaller number of better utilized servers. Consider the following:
- A typical 2-processor, 8-core server blade such as the HP BL460c consumes about 185W at zero load, yet under full load the same server consumes just under 500W, less than three times the zero-load figure.[ii]
- With many production servers running at less than 10% of their maximum capacity, a typical server blade will consume less than 200W.
- Under typical utilization, consolidation ratios of eight to ten workloads per physical server are realistic.
If as few as four or five of these server workloads are concentrated on a single physical server, the power consumption for these workloads would be more than halved. However, while overall power consumption falls, the per-server load (the power draw and heat output) more than doubles. Overall power consumption is down, but load per square foot is up. Depending on the design parameters of the data center and the presence or absence of secondary factors that influence server distribution, it may be necessary to optimize the data center environment to ensure that power and cooling distribution is aligned with the higher power and heat density. Older data centers designed for a lower power draw per square foot may not be able to support a fully loaded rack of current-generation servers running at maximum load. Even if the data center's rated capacity exceeds the predicted load, additional work may be required to ensure that the cooling system is adequate to the task. Inadequate raised-floor clearance, or a raised floor that has not been correctly maintained, may constrain airflow to the extent that the required airflow cannot be delivered to all racks.
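To make this arithmetic concrete, the following sketch estimates the before-and-after power draw using the figures above. The linear idle-to-full-load power model and the assumption that each workload consumes roughly 10% of a host's capacity are illustrative simplifications, not vendor data.

```python
# Back-of-the-envelope consolidation arithmetic using the figures cited
# above: ~185W at zero load and just under 500W at full load for a
# typical blade. The linear idle-to-full power model and the 10%
# per-workload utilization are simplifying assumptions.

IDLE_W, FULL_W = 185, 500   # per-blade draw, from the HP BL460c figures above
WORKLOAD_UTIL = 0.10        # each workload uses ~10% of one host's capacity

def host_watts(utilization: float) -> float:
    """Estimate blade power draw by interpolating linearly idle -> full load."""
    return IDLE_W + utilization * (FULL_W - IDLE_W)

for n in (4, 5, 8, 10):
    before = n * host_watts(WORKLOAD_UTIL)            # n lightly loaded blades
    after = host_watts(min(1.0, n * WORKLOAD_UTIL))   # one consolidated blade
    print(f"{n} workloads: {before:.0f}W -> {after:.0f}W "
          f"({100 * (1 - after / before):.0f}% reduction)")
```

Even at four or five workloads the total draw falls by well over half, while the consolidated host's own draw, and with it the heat output, climbs toward the 500W peak; this is the density problem described above.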
As time progresses, the percentage of volume servers in the data center will inevitably increase to the point where it approaches or even reaches 100%. During this time, data center managers will be able to continue to tune the data center power and cooling systems to maximize their efficiency, but there will come a point where further improvements in cooling efficiency will become hard to achieve without introducing new techniques.
Dynamic Workload and Power Management of Virtual Servers
Dynamic workload management has been offered by virtualization management system vendors such as CA for several years, with server power management offered as a more recent extension of this capability. However, adoption of these services has been limited, with few customers willing to deploy autonomous systems in production. Even accepting the limited options currently available, the largest obstacles to adoption of autonomous systems are more likely to be related to people and policy than to technology. An operations team trained to follow change control processes may struggle to adapt when faced with a system that can independently move workloads and power servers on and off as it sees fit; the dissonance will take some getting used to. Positioning the virtual infrastructure as a single "organic" system that is subject to change control at the system level, not at the component level (i.e., server or server workload), may be necessary to resolve this ambiguity.
Dynamic power management does present one other significant concern:
A report by AMD[iii] states that "A data center with an average heat load of 40W per square foot can cause a thermal rise of 25°F in 10 minutes, while an average heat load of 300W per square foot can cause the same rise in less than a minute." With high-density server blades putting out up to 500W each under peak load, powering on large numbers of blades in response to a sudden increase in demand can present challenges for a cooling system designed on the assumption that load levels will fluctuate slowly.
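The scale of that step change can be illustrated with a simple estimate. In the hypothetical sketch below, the blade count and rack footprint are assumed values; only the ~500W peak per blade comes from the figures cited earlier.

```python
# Illustrative estimate of the local heat-density step when a dynamic
# power manager wakes a full enclosure of blades at once. Blade count
# and footprint are hypothetical assumptions; the ~500W peak per blade
# is from the HP figure cited earlier.

BLADE_PEAK_W = 500    # peak draw per blade (from the text above)
BLADES_WOKEN = 16     # blades in one enclosure (assumed)
FOOTPRINT_SQFT = 20   # rack footprint including clearance (assumed)

step_density = BLADES_WOKEN * BLADE_PEAK_W / FOOTPRINT_SQFT
print(f"Waking {BLADES_WOKEN} blades adds ~{step_density:.0f} W/sq ft almost instantly")
```

A near-instant step of roughly 400W per square foot exceeds even the 300W per square foot figure that AMD associates with a 25°F rise in under a minute, leaving the cooling plant very little time to react.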
Desktop Virtualization
The emergence of server hosted virtual desktops (SHVD) is unlikely to be welcomed by data center managers. SHVD will result in a significant increase in data center power and cooling load as power draw is transferred from the business office, where it is today, into the data center. SHVD is similar in concept to Microsoft Windows Remote Desktop Service (RDS) except that it provides users with a full copy of a Windows desktop operating system instead of access to a shared Windows server operating system. Delivery of a full Windows instance requires more server resources than a RDS session and results in a significant increase in power consumption in the data center, from 1W to 2W per session for a RDS session to between 6W and 9W for a SHVD session on the same hardware.[iv] SHVD also requires a significant support infrastructure in the data center, most notably in storage, which in turn may constitute another significant power draw. While adoption of SHVD services may well result in a net reduction in power across the enterprise, it could be difficult for many data centers to accept this additional demand.
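To gauge what this shift means in aggregate, the sketch below multiplies the per-session figures above across a hypothetical user population; the 5,000-user count is an assumption for illustration, and storage and cooling overhead are excluded.

```python
# Rough sizing of the data center load added by SHVD versus RDS, using
# the per-session figures cited above (1-2W for RDS, 6-9W for SHVD).
# The 5,000-user population is a hypothetical assumption.

USERS = 5_000
RDS_W = (1, 2)    # (low, high) watts per RDS session
SHVD_W = (6, 9)   # (low, high) watts per SHVD session

def total_kw(low_high: tuple[int, int], users: int) -> tuple[float, float]:
    """Total server draw in kW across a (low, high) per-session range."""
    lo, hi = low_high
    return users * lo / 1000, users * hi / 1000

print("RDS:  %.0f-%.0f kW" % total_kw(RDS_W, USERS))
print("SHVD: %.0f-%.0f kW" % total_kw(SHVD_W, USERS))
```

For 5,000 users, SHVD adds roughly 30 to 45kW of continuous server load against 5 to 10kW for RDS, before the storage infrastructure noted above is counted.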
Recommendations
The effect of virtualization on data center power and cooling is rapidly evolving. Until recently, virtualization had few negative consequences in the data center beyond the need to accommodate high power loading and thermal density. The introduction of dynamic power management for virtual server platforms and the adoption of desktop virtualization both offer the potential to significantly modify this position. Prior to deployment of these technologies, enterprises should take these measures:
- Review power distribution strategies/technology in anticipation of server consolidation – Virtualization generally results in higher server utilization and hence greater power consumption. The data center power distribution network should be assessed to ensure it can accommodate higher load density.
- Review cooling strategies/technology in anticipation of server consolidation – Many older data centers were not designed to accommodate high thermal loading and may not be able to accommodate dense blade server configurations without remediation.
- Work in partnership – Work with business and facilities stakeholders to gain acceptance of the concept of autonomous server workload management. Organizations that have developed mature change control rules based on precise awareness of all operational activities may struggle to accept the uncertainty that comes with ceding control of data center operations to autonomous systems. Similarly, owners of configuration management databases (CMDBs) may be reluctant to accept that their systems can no longer report the precise configuration of the virtual infrastructure at any moment in time. Do not expect to develop a solution in isolation, and do not expect buy-in if the solution is announced as a fait accompli.
- Limit deployment to non-production systems – Dynamic power management of virtual server environments is still in its first generation of products, with little feedback yet available from field deployments. Organizations considering dynamic workload management should therefore limit deployments to non-production systems until sufficient confidence is gained and vendors have had the opportunity to address any shortcomings. Avoid targeting hot-standby disaster recovery or business continuity environments, even if they appear to be easy wins with large rewards.
- Ensure that you have vendor support – Newer server and blade power supplies are usually able to tolerate frequent power cycling without negative consequences. However, to prevent possible miscommunication both internally and with vendors, it is important to obtain written confirmation from vendors that frequent power cycling neither reduces mean time between failures (MTBF) nor invalidates warranty agreements. Few, if any, vendors are experienced at responding to this type of request, and it may be necessary to make repeated requests and escalations before a satisfactory response is obtained.
- Look for fail safe (fail-on) operation – Any dynamic power management system must fail on (i.e., it must have a means of autonomously powering on all servers if the management system itself fails) if it is to be considered for environments where a service level agreement (SLA) cannot be met through manual intervention (e.g., scripted use of Wake on LAN or powering servers on by hand from the data center floor).
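As a point of reference for the manual fallback mentioned above, the following is a minimal sketch of a scripted Wake on LAN sender; the MAC and broadcast addresses are placeholders for your environment.

```python
# Minimal Wake on LAN sketch for manually powering on standby servers if
# the power management system fails. MAC and broadcast addresses below
# are placeholders, not real values.
import socket

def wake_on_lan(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Send a standard WoL magic packet: 6 bytes of 0xFF, then the MAC 16 times."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC address must contain exactly 6 octets")
    packet = b"\xff" * 6 + mac_bytes * 16
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))

# Wake each standby host on a recorded inventory list (placeholder MACs).
for mac in ("00:11:22:33:44:55", "00:11:22:33:44:56"):
    wake_on_lan(mac)
```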
References
[i] US Environmental Protection Agency, Report to Congress on Server and Data Center Energy Efficiency, Public Law 109-431, https://astroarch.wpenginepowered.com/wp-content/uploads/2010/10/EPA_Datacenter_Report_Congress_Final1.pdf, published August 2, 2007, accessed online September 20, 2010
[ii] HP BladeSystem Power Sizer, http://h20338.www2.hp.com/ActiveAnswers/cache/347628-0-0-0-121.html, accessed online September 20, 2010
[iii] AMD, Power and Cooling in the Data Center, http://www.amd.com/us/Documents/34146A_PC_WP_en.pdf, published March 29, 2007, accessed online September 20, 2010
[iv] Entelechy Associates, September 2010