We all try to increase the density of our workloads, and we sometimes succeed, but higher density still escapes many folks, whether they are in a cloud or using an on-premises virtual environment. Are there ways to help us gain more density within our environments? Is it still fear that keeps us from doing so? Are there real issues we still need to solve? Why are most environments running with CPU to spare? Is there still a fear of running too many things on any one system?
There is fear. There is uncertainty. There is doubt. FUD rules the Enterprise even to this day. It is one reason we no longer run our hypervisor environments so densely. We fear having too many of our eggs in one basket. This is also the reason many are not entering the cloud as quickly as others think they should. We fear having our workloads outside our control.
Virtual environments are designed to be highly available, to resource-schedule across nodes. There are tools to aid in what I termed, many years ago, “dynamic resource load balancing” (DRLB). DRLB, unlike VMware DRS, is a true load balancing of workloads across an environment. Where DRS is a contention-based reactive solution, DRLB is done with planning and forethought as well as a deep understanding of the environment. Properly used, DRLB would alleviate FUD as it refactors as workloads are added and moved around.
This refactoring could occur with an aim to move workloads onto a minimum of resources, or to spread workloads across all available resources. Each approach is valid; each approach includes capacity and performance management. And each is represented by various tools. DRLB’s complexity increases when we speak of containers, however. There, we need to do DRLB not within a hypervisor but within container operating systems, which may be either virtual or physical.
Ignoring containers for now, there are three tools that work with normal virtual environments to balance workloads either on the least amount of resources or across all resources: your choice.
Those tools are:
- Cirba, which provides algorithms to fit as many workloads on as few hosts as possible. In essence, Cirba plays Tetris with your workloads according to simple or complex policy requirements. For example, a policy could declare an entire cluster off-limits to PCI or HIPAA workloads, or even to dev/test, while other policies would allow any of these types of workloads. Policy based on workload classification allows the algorithms to properly place workloads.
- VMTurbo, which provides cost-based algorithms to load balance workloads across all hosts within a cluster, and in some cases across clusters. VMTurbo’s algorithms look at all resources within a cluster and balance them across storage, memory, networking, and compute capabilities based on the cost of those resources, allowing you to spend less.
- Virtual Instruments, which provides recommendations so that workloads can be moved around a cluster to provide better balance. Virtual Instruments’ approach is to move workloads around as a way of relieving storage-related issues. It also provides visibility across a wide range of hypervisors.
These tools work with multiple hypervisors; these tools work to alleviate hot spots before they can impact your workloads. These tools work with various private cloud technologies. These tools unfortunately do not work with hybrid off-premises clouds yet. They are all moving in that direction.
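The two placement goals described above, packing workloads onto as few hosts as possible versus spreading them across all hosts, can be illustrated with a minimal Python sketch. This is not how Cirba, VMTurbo, or Virtual Instruments actually work; it is a simple first-fit-decreasing heuristic for packing and a least-loaded heuristic for spreading, looking at CPU only. The `Host` and `Workload` classes, the `barred_tags` policy field, and the sample data are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Workload:
    name: str
    cpu: int              # CPU demand in arbitrary units (assumption)
    tag: str = "general"  # workload classification, e.g. "pci" (assumption)


@dataclass
class Host:
    name: str
    cpu_capacity: int
    barred_tags: frozenset = frozenset()  # classifications this host must not run
    workloads: list = field(default_factory=list)

    @property
    def used(self):
        return sum(w.cpu for w in self.workloads)

    def can_take(self, w):
        # Policy check first, then a simple capacity check.
        return w.tag not in self.barred_tags and self.used + w.cpu <= self.cpu_capacity


def place(workloads, hosts, strategy):
    """Place workloads on hosts; return the workloads that fit nowhere.

    strategy="pack":   first-fit decreasing, fills earlier hosts first.
    strategy="spread": picks the least-utilized eligible host each time.
    """
    unplaced = []
    # Largest workloads first, so the big pieces are placed while room remains.
    for w in sorted(workloads, key=lambda w: w.cpu, reverse=True):
        candidates = [h for h in hosts if h.can_take(w)]
        if not candidates:
            unplaced.append(w)
            continue
        if strategy == "pack":
            target = candidates[0]
        else:
            target = min(candidates, key=lambda h: h.used / h.cpu_capacity)
        target.workloads.append(w)
    return unplaced


def make_hosts():
    # Hypothetical cluster: host "c" is off-limits to PCI workloads.
    return [
        Host("a", 10),
        Host("b", 10),
        Host("c", 10, barred_tags=frozenset({"pci"})),
    ]


SAMPLE_WORKLOADS = [
    Workload("payments", 6, "pci"),
    Workload("web", 4),
    Workload("batch", 3),
    Workload("audit", 2, "pci"),
]

if __name__ == "__main__":
    for strategy in ("pack", "spread"):
        hosts = make_hosts()
        leftover = place(SAMPLE_WORKLOADS, hosts, strategy)
        usage = {h.name: h.used for h in hosts}
        print(strategy, usage, "unplaced:", [w.name for w in leftover])
```

With the sample data, the pack strategy leaves one host entirely empty, while the spread strategy uses all three; in both cases the PCI-tagged workloads stay off the host whose policy bars them. A real tool would weigh memory, storage, and network as well, and would re-evaluate placement continually as workloads change.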
These tools will also work in a container-based world that has one container per virtual machine. If there is more than one container per virtual machine, the tools will work, but they are VM-centric, not application- or container-centric. To be application-centric implies that the application (and its thousands of containers or virtual machines) can be determined automatically, imported in some fashion, or defined easily. It also implies that all aspects of the application are looked at as a whole, and not as individual VMs. These three tools are gaining some idea of the application and how that application fits within a virtual environment, but only if that application is a well-defined one that uses hypervisor constructs such as resource pools, folders, etc. They do not yet work from the true definition of the application, which includes database tables, security measures, and network definitions, as well as the virtual machine or container-based application code.
And that is the rub; that is the issue. If the tools maintain a VM-centric view, they miss the rest of the environment. They do not understand the network except as a resource to be balanced across. They do not understand that a virtual security appliance providing some form of network function virtualization (perhaps load balancing, intrusion detection, or a web application firewall) is part of the application. They do not even understand which table(s) of a database belong to an application.
However, these tools are striving to get there. They can import definitions, tag systems, and grow to understand their environments better. Also, these tools are starting to look at external clouds, as well as business systems such as IBM PowerVM, as first-class citizens within their tools.
We have gone past plain capacity management to predictive workload placement within a multi-hypervisor world, as well as rightsizing those workloads across myriad platforms. In essence, we are playing continual games of Tetris with our workloads for better fit within our environments. Let these tools help you get over the FUD, allow your hypervisors regardless of vendor to be better utilized, help lower your overall costs, and locate hot spots before they become major problems and outages. These tools allow your day-to-day operations to be predictive rather than reactive, perhaps even freeing up time for research into newer technologies.