The Virtualization Practice recently moved its systems to the cloud. Being cost conscious, we chose one of the public clouds. The reality of such a move is much different from the hype. We expected stellar support, better performance, improved security, improved DR, and 5 9s uptime, with the hypervisor as a commodity. In essence, it should have been better than we could do ourselves. That is the promise of the cloud, the hype of the cloud. What we have seen is something far different.
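To put "5 9s" in perspective, the arithmetic is easy to sketch. The function name and the 365-day year below are illustrative assumptions, not terms taken from any provider's SLA:

```python
def downtime_budget_minutes(nines: int, period_days: float = 365.0) -> float:
    """Minutes of downtime allowed per period at an availability of `nines` nines."""
    unavailability = 10.0 ** (-nines)  # e.g. 5 nines -> 0.00001
    return unavailability * period_days * 24 * 60


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(f"{n} nines: {downtime_budget_minutes(n):.2f} minutes per year")
```

At five nines the budget works out to a little over five minutes per year, so a single unnoticed VM crash can use it up.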
Which IaaS vendor we chose is relatively unimportant. The main takeaway is that what you put into the cloud is what you get from the cloud and, unfortunately, the hypervisor still matters. The underlying hypervisor dictates how you bring your applications, and perhaps your data, to the cloud. Our local environment uses VMware vSphere (and we were just about to move everything into vCloud when we had to move our datacenters). In choosing a cloud we went with the following requirements (yes, it is an ordered list):
- Cost
- Service Level Agreement
- Ease of Getting our Data to the Cloud
- Monitoring
- Automation
- Ease of Use
However, what we got was entirely different from what we had. We have a laundry list of problems, not just with our chosen cloud but with a number of the clouds we looked into:
- One cloud quoted us $800 per VM, which was cost prohibitive. When we went back to them, we finally got a price that seemed reasonable but was still high.
- Some clouds did not post their SLA anywhere we could view it. To me, that meant the cloud did not have an SLA available at all; I should not have to ask for it.
- No cloud we looked at offered (even though it was planned) a replication receiver cloud for the three backup and disaster recovery tools we have installed, Veeam being one of them, not even for an additional fee. Some claimed to have Zerto support, but our need was immediate and that support was months away. So much for using replication as a means to move our data.
- Monitoring ended up being faulty in our chosen cloud. Because of the underlying hypervisor, the monitor they provide would never have let us know there was a problem, and we had problems.
- All clouds have handy automation, not for deploying my workloads but for critical features such as rebooting a virtual machine, accessing the console to debug over a secure mechanism, or creating a virtual machine from a template well known to the cloud. The intricacies of your application are what you bring to the cloud and are generally out of scope for any automation. However, once the workload is installed, there is usually a means to snapshot it for future deployment.
- All clouds should be easy to use with no need for me to call service just to find out how to handle things that normally pop up. An iPhone or Android App is a nice touch.
But we still had problems once we got into the cloud:
- We did not have an automation script for deploying our applications with all their intricacies. This is a must when you go to the cloud, as what works as expected in the data center may miss something when deployed into a public cloud. We overlooked an operating system configuration that caused a VM to crash. Given that this was a business-critical application, crashing was not an option. However, this seems to be hypervisor specific: Xen did not handle the memory usage as well as VMware vSphere. Regardless, the VM should never have crashed, and if it did we should have been notified. This violated the SLA, and is where the cloud’s monitoring failed.
- Performance was atrocious. It is still worse than on our own infrastructure: the application runs 2x slower in the cloud.
- Security is just not there. "The cloud does security better" is always the comment; my experience is that it does not. Our application was hacked almost immediately (easily fixed, thankfully). We had to add more security to our workloads, security that ends up hampering functionality.
- Disaster recovery is available, but it is a crash-consistent backup of your workloads. Many clouds we looked at had no real virtualization-aware backup mechanisms.
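Since the cloud's own monitor never told us about the crash, one option is to run an independent check from outside the cloud. The following is only a minimal sketch, assuming a plain TCP reachability probe and an alert after several consecutive failures. The host, port, threshold, and function names are hypothetical and it does not use any cloud provider's monitoring API:

```python
import socket
import time
from typing import Callable, List


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def should_alert(history: List[bool], threshold: int = 3) -> bool:
    """Alert once the most recent `threshold` probes have all failed."""
    if len(history) < threshold:
        return False
    return not any(history[-threshold:])


def watch(host: str, port: int, interval: float = 60.0, threshold: int = 3,
          rounds: int = 5, notify: Callable[[str], None] = print) -> List[bool]:
    """Probe host:port `rounds` times; notify on every round past the threshold."""
    history: List[bool] = []
    for _ in range(rounds):
        history.append(is_reachable(host, port))
        if should_alert(history, threshold):
            notify(f"ALERT: {host}:{port} failed {threshold} consecutive probes")
        time.sleep(interval)
    return history
```

Calling `watch("app.example.com", 443)` from a machine outside the cloud (the hostname is a placeholder) would flag a VM that has stopped answering. A TCP probe only shows that the port accepts connections, so an application-level health check is still worth adding on top.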
Our first foray into running workloads in the cloud had some teething problems, and those are now over. They are fixed and the workloads are up and running, but it took our knowledge of the application to fix them, not the knowledge of the cloud support folks. In reality, they were very little help. Which leads me to the next big concern about moving to the cloud: support.
Support within the cloud is not stellar. The support staff know their systems extremely well, but not the underlying mechanisms that make everything tick. For example, our workload should never have crashed, but the crash came down to underlying differences between CentOS and RedHat as well as between Xen and vSphere. Cloud support folks also do not know your applications. Support for them is left entirely in your hands, which may not be what you want. This may drive the need for community or specialized clouds, so that support knowledge does not need to be as broad as it must be for the public cloud.
The public cloud reality is much different from the hype. You only get from the public cloud what you bring to it. You need to bring your own security, performance monitoring, knowledge, and expertise. And unfortunately, the hypervisor still matters.
From my experience working at a service provider supporting traditional enterprise environments, including Private/Managed Clouds as well as a Public Cloud, you have valid advice here 🙂