Many people seem to think that in this brave new world of converged infrastructure and software-defined everything, the era of standalone storage and networking is coming to an end. Indeed, it’s becoming quite popular to think differently about storage. There are new types of clustered and distributed storage options, like Ceph and Gluster, that rethink the way storage is delivered and built. There are virtual storage appliances (VSAs), like the HP StoreVirtual VSA and NexentaVSA, that essentially replicate standalone hardware in a virtual machine. There are also hybrid approaches, where companies like Nutanix, Scale Computing, and Simplivity deliver a clustered file system that’s tightly integrated, via virtual machine, with their products.
All of that is pretty interesting, and in terms of functionality and price it leads to a wider range of choices for IT shops. I think there are three main points to keep in mind, though, when talking to vendors about their virtualized, clustered, software-defined storage.
Virtual Storage Appliances Need Dedicated Resources.
VSAs are completely different from your average virtualized workload. Virtualization tends to rely heavily on overcommitment, banking on the fact that not all of your workloads are going to be running at 100% all the time. What happens when your storage is competing for resources with your workloads, though? What happens when those workloads are using the storage that's being starved of resources? Nothing good.
As a result, overcommitment isn't an option for VSAs. For best results, VSAs need 100% fully committed resources, which often means giving them decent amounts of reserved RAM and CPU, and pinning them, using affinity rules, to two or more physical CPUs. Giving them affinity for a particular CPU reduces the impact of context switching, the time it takes a CPU to stop working on one process and load the data for another. Context switching is a non-trivial task, especially if a workload moves around in such a way that the CPU's L2 and L3 caches become ineffective. Most workloads don't notice this happening, but most workloads aren't providing the underpinnings for your entire virtual infrastructure, either.
A good example of this is Nutanix. Their clustered filesystem is implemented as a virtual machine on each physical node. Each physical node has two six-core CPUs, and the virtual machine providing the storage consumes two of those cores, as well as 10 GB of RAM. This might be an acceptable tradeoff for the feature set and price, but it does need to be a factor in sizing and ROI calculations. For example, VDI deployments are often sized in terms of virtual desktops per core. If you're planning on 14 VMs per core for a virtual desktop implementation, that's 28 fewer desktops per node because of the virtualized storage.
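To make that math concrete, here's a quick back-of-the-envelope sketch in Python. The figures are just the ones from the example above (two six-core CPUs, two cores for the storage VM, 14 desktops per core), not vendor sizing guidance:

```python
# Back-of-the-envelope VDI sizing when a VSA takes its cut of each node.
# These numbers are illustrative assumptions, matching the example above.

cores_per_node = 2 * 6      # two six-core CPUs
vsa_cores = 2               # cores reserved by the storage VM
desktops_per_core = 14      # planned VDI density

raw_capacity = cores_per_node * desktops_per_core                    # 168 desktops
usable_capacity = (cores_per_node - vsa_cores) * desktops_per_core   # 140 desktops

print(f"Desktops lost per node to the VSA: {raw_capacity - usable_capacity}")  # 28
```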
Virtual Storage Appliances Need Decent Underlying Storage And Management.
VSAs don’t magically produce storage on their own. They need decent amounts of storage present on the host, properly sized both for capacity and performance. With traditional monolithic arrays, it was often easier to figure out the performance requirements, because you could aggregate the performance of all the disks, and the controllers had more dedicated channels to access the disks. Your average direct-attached disk array usually passes I/O through a SAS expander before it reaches the drives, which drastically overcommits the bandwidth between the drives and the controller. Traditional monolithic disk arrays also tended to have more onboard cache than per-host direct-attached storage does, and could use it more flexibly. The best local disk controller from Dell, the H710, has 1 GB of cache on it, and two 6 Gbps links. On a Dell PowerEdge R720xd those two 6 Gbps links service up to 24 drives.
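Here's a rough sketch of what that oversubscription looks like, using the numbers above and assuming, as a worst case, that each drive could burst at the full 6 Gbps lane rate (real drives rarely sustain that, so treat this as a ceiling, not a measurement):

```python
# Rough oversubscription math for direct-attached drives behind a SAS expander.
# Assumes each drive can burst at the full 6 Gbps lane rate -- a worst-case ceiling.

drives = 24                    # drive bays on the R720xd example above
drive_link_gbps = 6            # SAS lane rate per drive
controller_links = 2           # links between the controller and the expander
controller_link_gbps = 6

drive_side_bw = drives * drive_link_gbps                        # 144 Gbps drive side
controller_side_bw = controller_links * controller_link_gbps    # 12 Gbps controller side

print(f"Oversubscription ratio: {drive_side_bw / controller_side_bw:.0f}:1")  # 12:1
```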
Virtual machines’ I/O requirements are also not homogeneous. One VM might be consuming 1,400 IOPS while the next is doing 25. When you split these workloads up among individual physical hosts, you need to account for that. As a result, you might spend more time than you expected managing and troubleshooting performance and capacity. You also might need to gather better performance data, if your operations are lacking in that area.
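A tiny sketch of the kind of accounting involved follows. The VM names and IOPS figures are made up for illustration; in practice you'd pull them from your monitoring tools:

```python
# Toy example: per-host IOPS demand depends entirely on which VMs land where.
# The hosts, VM names, and numbers below are hypothetical placeholders.

hosts = {
    "host-01": {"sql-01": 1400, "web-01": 25, "web-02": 40},
    "host-02": {"file-01": 300, "web-03": 30, "web-04": 35},
}

for host, vms in hosts.items():
    demand = sum(vms.values())
    print(f"{host}: {demand} IOPS demanded by {len(vms)} VMs")
```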
Virtual Storage Appliances Still Rely On Networking.
To be resilient, most VSAs and clustered filesystems rely heavily on networking between the hosts, to facilitate copying of data to another node in case the primary node fails. That’s an excellent idea. Some VSAs do this work asynchronously, some do it synchronously. Synchronous replication is safer, because it guarantees that a write is committed in both places before it’s acknowledged. However, synchronous replication does mean you’re at the mercy of your network latency. And virtualized networking often has some of the same limitations that VSAs have, particularly with resource overcommitment. Spend some time up front to make sure your network will perform, and that the cost of retrofitting it is accounted for if it won’t.
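A simplified model makes the latency tradeoff obvious: with synchronous replication, every write waits on the network round trip and the remote commit before it can be acknowledged. The millisecond figures below are illustrative assumptions, not measurements from any particular product:

```python
# Simplified, serialized model of write acknowledgement latency.
# All figures are illustrative assumptions; plug in your own measurements.

local_write_ms = 0.5    # time to commit the write on the local node
remote_write_ms = 0.5   # time to commit the write on the partner node
network_rtt_ms = 0.3    # round trip between the two nodes

async_ack_ms = local_write_ms                                    # acknowledged locally, replicated later
sync_ack_ms = local_write_ms + network_rtt_ms + remote_write_ms  # waits for the partner commit

print(f"Async write acknowledged in ~{async_ack_ms} ms")
print(f"Sync write acknowledged in ~{sync_ack_ms} ms")
```

The point isn't the exact numbers; it's that the network term shows up in every single write, which is why the interconnect deserves scrutiny before you deploy.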
Don’t think that I am anti-VSA.
VSAs are great options for many IT shops. Like many new technologies, they just require a new set of questions and a different thought process, so we know what we’re getting into as we buy and deploy these solutions. Good luck!
If you’re running your storage as a VSA, it’s only because you’ve chosen VMware as your hypervisor.
The others you mentioned, Ceph and Gluster for instance, wouldn’t normally have to run in a VM. They can run natively alongside most modern hypervisors.
@Theron — true. You pay for them with compute resources just as you pay for VMware-hosted VSAs; it’s just at a different level. They also need network connectivity and management from people, neither of which can happen in the same way you manage the virtual infrastructure, because they sit outside of it.
As this type of technology becomes more popular, customers will need fast, low-latency connections between the VSAs. I see more use of 10GbE, 40GbE, and Infiniband for customers that need very fast synchronization or failover between nodes.