In part one of Cost to Build a New Virtualized Data Center, we discussed the basic software costs for a virtualized data center based on VMware vSphere 6.0, Citrix XenServer 6.5, Microsoft Hyper-V 2012 R2 and 2016, and Red Hat. If you missed that, please click here to review before continuing.
This post will take that original premise and expand it to include storage with a view to moving the entire environment toward a software-defined data center.
Once again, our compute unit of choice is the Dell 730xd with two 10-core CPUs and 256 GB of RAM. Now, we need to add some local storage to each node. This compute node can, depending on the choices made during configuration, take up to twenty-four disk drives. For the purposes of this article, we assume that data locality is required for performance, and that there is a need for an all-flash array. We have chosen to go with two 400 GB SLC drives for cache and four 800 GB MLC drives for capacity. This means that there is a total raw capacity per node of 4 TB. There may be a requirement for further hardware, depending on the chosen solution for each hypervisor, but that will be called out in the relevant vendor section. Due to the length of this article, we have split it into two sections. This post deals with the costs surrounding vSphere and Hyper-V.
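As a quick back-of-the-envelope check of that per-node figure, the sketch below reproduces the arithmetic; the drive counts and sizes are the ones stated above, and everything else is purely illustrative.

```python
# Per-node raw flash capacity for the assumed drive layout.
cache_gb = 2 * 400      # two 400 GB SLC drives for the cache tier
capacity_gb = 4 * 800   # four 800 GB MLC drives for the capacity tier

raw_per_node_gb = cache_gb + capacity_gb
print(f"Raw flash per node: {raw_per_node_gb / 1000:.1f} TB")  # 4.0 TB
```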
Data Center Based on VMware vSphere
In part one of this series, we noted that the costs for a pure vSphere data center, with regard to software, came in at $90,540. This includes three years of service and support. It must also be noted that, as the majority of enterprises and businesses are still heavily Microsoft-based, there would be additional costs of up to $115K for the Microsoft Windows Server 2012 R2 Datacenter edition, to cover unlimited virtualization rights for server operating systems.
VMware has a product called Virtual SAN (VSAN) that works in the kernel of an ESXi host to present local direct-attached storage as a shared resource. There are three editions: Standard, Advanced, and Enterprise.
| | Standard | Advanced | Enterprise |
| --- | --- | --- | --- |
| Overview | Hybrid hyper-converged deployments | All-flash hyper-converged deployments | Site availability and quality of service controls |
| License entitlement | 1 CPU or per VDI desktop | 1 CPU or per VDI desktop | 1 CPU or per VDI desktop |
| Storage Policy-Based Management | ● | ● | ● |
| Read/Write SSD Caching | ● | ● | ● |
| Distributed RAID (RAID 1) | ● | ● | ● |
| VSAN Snapshots & Clones | ● | ● | ● |
| Rack Awareness | ● | ● | ● |
| Replication (5-min RPO) | ● | ● | ● |
| Software Checksum | ● | ● | ● |
| All-Flash Support | | ● | ● |
| Inline Deduplication & Compression (All-Flash Only) | | ● | ● |
| Erasure Coding – RAID 5/6 (All-Flash Only) | | ● | ● |
| Stretched Cluster | | | ● |
| QoS – IOPS Limits | | | ● |
Another consideration is the requirement for a supported storage controller from the VMware HCL; one of the main requirements is a pass-through IO controller. Here we hit our first issue. Currently, there are no Dell-supported controllers for VSAN 6.2 when you require an all-flash array. Hybrid? Yes. This means that with our current compute unit of choice, we could not utilize VMware's latest and greatest version of VSAN, but would instead have to downgrade to version 6.1, with a significant loss of features and performance. So, we are adding the assumption that Dell has now successfully completed testing and a supported controller is available, and that we actually chose a supported controller when we originally purchased the machines. Other assumptions are that we need optimum performance and maximum utilization.
The next decision we need to make is which edition will fulfill our requirements, these being:
- Optimum performance
- Maximum utilization
- Snapshots and clones
- Storage-based policy
Referring to the table above, we can see that requirements three and four are covered in all three versions. However, to gain optimum performance and maximum utilization, we will need either the Advanced or the Enterprise version, as these give us the ability to use all-flash as our data storage devices, and to use erasure coding and deduplication. We have no requirement for a stretched cluster at this time, and although QoS would be nice on a per-VMDK basis, it is something that we can live without for now. Therefore, we will pick the middle ground and go with the Advanced edition, at $3,995 per CPU. Now, one of the things we do like about VSAN licensing is that it is not capacity-based. Therefore, once you have purchased your CPU, you can grow your per-node capacity without penalty.
Next, we need to figure out how many nodes will need a VSAN license. The minimum needed for a valid VSAN deployment is two nodes and a witness. However, this would not satisfy our requirement for maximum utilization, as we would need to set our redundancy to mirroring, immediately losing 50% of our capacity. So, we will need to look at adding erasure coding to the mix, but should it be RAID 5 or RAID 6? If we move to RAID 5 erasure coding, we will need a minimum of four nodes (three data and one parity), and with RAID 6, a minimum of six nodes (four data and two parity). It is at this time that we need to consider the number of nodes that we, as data users, are able to tolerate failing before data loss, and how this will affect the amount of capacity utilized.
Let’s look at an example 50 GB disk:
- 50 GB disk with FTT=1 & FTM=RAID 1 –> 100 GB of raw capacity needed (two full copies)
- 50 GB disk with FTT=1 & FTM=RAID 5 –> ≈67 GB of raw capacity needed (1.33x)
- 50 GB disk with FTT=2 & FTM=RAID 1 –> 150 GB of raw capacity needed (three full copies)
- 50 GB disk with FTT=2 & FTM=RAID 6 –> 75 GB of raw capacity needed (1.5x)
As we can see, the capacity savings can be significant, especially with a failures-to-tolerate setting of two, where the potential saving is 50%.
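The short sketch below reproduces this arithmetic, using the commonly quoted VSAN multipliers (mirroring keeps FTT+1 full copies; RAID 5 is roughly 1.33x and RAID 6 is 1.5x). The function name is ours for illustration, not a VMware API.

```python
def raw_space_needed(vmdk_gb, ftt, erasure_coding=False):
    """Approximate raw capacity consumed by a VSAN object.

    Mirroring (RAID 1) keeps ftt + 1 full copies. Erasure coding uses the
    commonly quoted multipliers: ~1.33x for RAID 5 (FTT=1) and 1.5x for
    RAID 6 (FTT=2).
    """
    if not erasure_coding:
        return vmdk_gb * (ftt + 1)
    multiplier = {1: 4 / 3, 2: 1.5}[ftt]
    return vmdk_gb * multiplier

for ftt in (1, 2):
    for ec in (False, True):
        ftm = "RAID 5/6" if ec else "RAID 1"
        print(f"50 GB disk, FTT={ftt}, FTM={ftm}: "
              f"{raw_space_needed(50, ftt, ec):.0f} GB raw")
```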
From this, we can see that the minimum number of VSAN Advanced licenses we will require is 12 CPU licenses (six dual-socket nodes for RAID 6). However, to minimize later disruption, we are going to license all twenty CPUs across our ten hosts.
Uplift costs for software:
| Product | Description | Number | Unit cost (USD) | Subtotal (USD) |
| --- | --- | --- | --- | --- |
| VSAN Advanced | Perpetual 1 CPU license with 1 yr software maintenance | 20 | $3,995 | $79,900 |
| Subtotal | | | | $79,900 |
Now, remember that potential savings could be made here if we licensed only the minimum number of nodes required for a valid cluster, as non-VSAN nodes can still access a VSAN datastore. However, remember our requirement for optimum performance, which by association means data locality, and that is only available if all nodes are licensed and participate in the VSAN cluster. So, to sum up VMware's costs:
The initial virtualization cost from part one is $90,540, and the VSAN cost is $79,900. So, the total VMware running costs are currently $170,440, not including any Windows data center costs needed to run Microsoft operating systems on the environment.
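For completeness, here is that running total as a quick sketch, using only the figures quoted above.

```python
# Figures from part one and the VSAN uplift table above.
vsphere_software_usd = 90_540          # vSphere licences with 3 yrs service and support
vsan_cpus = 10 * 2                     # ten dual-socket hosts
vsan_uplift_usd = vsan_cpus * 3_995    # VSAN Advanced at $3,995 per CPU

print(f"VSAN uplift:  ${vsan_uplift_usd:,}")                         # $79,900
print(f"VMware total: ${vsphere_software_usd + vsan_uplift_usd:,}")  # $170,440
```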
Data Center Based on Microsoft Windows 2012 R2 Hyper-V
Things start to get a little muddier when we look at Hyper-V. Hyper-V, unlike VMware, does not have a native hyperconverged product that mimics VSAN. However, there are third-party products that can fill the role. One such product is DataCore's Virtual SAN. The product's concept is similar to that of DataCore's SanSymphony product; however, it is installed either in a Windows VM or in the parent partition (the Hyper-V equivalent of Dom0) of a Hyper-V host.
First, it is important to understand that DataCore Virtual SAN is not a like-for-like equivalent of VSAN. It only mimics the concept of an in-kernel storage target: it is actually an application running on the underlying Windows operating system that Hyper-V utilizes. DataCore also uses physical RAM as the caching layer in its storage product. This could mean less memory available on the Hyper-V hosts for running virtual machines, and therefore more licensing costs and physical hardware. So, we are adding an assumption that all compute and memory requirements are met with the current ten nodes.
The architecture of DataCore's product is such that redundancy is per node pair, as it uses failover clustering for resilience. However, this is in active/active mode, so both nodes in the pair can do useful work. At the same time, it does mean that it is difficult to maximize capacity utilization, as we are at a 50% utilization point due to the mirrored-pair design. This is mitigated somewhat by the fact that deduplication, thin provisioning, and compression are included with the base license. One thing to note is that deduplication is done post data commit, not inline. There are benefits to both post-process and inline deduplication, but that is not the focus of this article.
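To make the utilization point concrete, here is a rough sketch of the usable capacity across the ten nodes. The 2x deduplication and compression ratio is a purely hypothetical figure for illustration, not a DataCore number.

```python
# Rough usable-capacity math for a mirrored node-pair design.
raw_per_node_tb = 4
nodes = 10

raw_cluster_tb = raw_per_node_tb * nodes   # 40 TB of raw flash
usable_tb = raw_cluster_tb / 2             # every byte is mirrored across a node pair
effective_tb = usable_tb * 2.0             # hypothetical post-process dedupe/compression ratio

print(f"Raw {raw_cluster_tb} TB -> usable {usable_tb:.0f} TB "
      f"-> ~{effective_tb:.0f} TB effective after data reduction")
```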
Again, we have our requirements of:
- Optimum performance
- Maximum utilization
- Snapshots and clones
- Storage-based policy
We have already found out that maximum utilization is going to be limited, but we can work around that with deduplication and compression. For optimum performance, we will require data locality, so once again, all ten nodes will need to be licensed. DataCore’s product also provides storage-based policy management and snapshots and cloning technology.
Next, we need to understand the licensing matrix for DataCore. This product is licensed per node; however, there is a managed storage capacity part to add to the picture. The way DataCore licenses its products is refreshingly simple. There is a cost per node based on the capacity of managed (or seen) storage under control, and then a fixed cost for service and support. The licenses available are 4 TB, 8 TB, 16 TB, 32 TB, etc., up to and including a 1 PB license. If you outgrow your capacity in a particular node pair, you just upgrade that node to the new capacity level license and uplift your service and support cost. There is no penalty for using SSD devices over traditional HDD devices. Also, it is just the difference in price from your previously licensed level to the new level. We like this clarity.
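The sketch below illustrates that "pay only the difference" upgrade model. Only the 4 TB per-node price comes from the quote later in this section; the larger tier prices are hypothetical placeholders used purely to show the mechanics.

```python
# Hypothetical per-node licence prices by capacity tier (TB -> USD).
# Only the 4 TB figure is taken from the quote in this article; the rest
# are placeholders to illustrate the upgrade mechanics.
TIER_PRICE_USD = {4: 2_000, 8: 3_500, 16: 6_000, 32: 10_000}

def upgrade_cost(current_tb, new_tb):
    """Cost to move one node from its current capacity tier to a larger one:
    you pay only the difference between the two licence levels."""
    return TIER_PRICE_USD[new_tb] - TIER_PRICE_USD[current_tb]

print(f"4 TB -> 8 TB upgrade per node: ${upgrade_cost(4, 8):,}")
```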
We will still have the requirement for optimal performance; therefore, data locality is a given. This means that we will need each node to be licensed.
| Product | Description | Number | Unit cost (USD) | Subtotal (USD) |
| --- | --- | --- | --- | --- |
| DataCore Virtual SAN | Perpetual server node license for 4 TB | 10 | $2,000 | $20,000 |
| DataCore Service and Support | 3 years service and support | 10 | $1,000 | $10,000 |
| Subtotal | | | | $30,000 |
This will result in five mirrored datastores (one per node pair) that can be attached to each node, unlike the single datastore presented by VSAN. With DataCore, the running total is now roughly $173K.
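For completeness, the arithmetic behind that running total; the ≈$143K base is the Hyper-V 2012 R2 software figure implied by the totals quoted in this article.

```python
# Figures from part one and the DataCore table above.
hyperv_2012r2_usd = 143_000             # approximate Hyper-V 2012 R2 software cost from part one
datacore_usd = 10 * 2_000 + 10 * 1_000  # node licences plus 3 yrs service and support

print(f"DataCore uplift:  ${datacore_usd:,}")                     # $30,000
print(f"Hyper-V total:   ~${hyperv_2012r2_usd + datacore_usd:,}") # ~$173,000
```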
Data Center Based on Microsoft Server 2016 Hyper-V
Things start to get interesting here. With the release of Server 2016, Microsoft has entered the hyperconverged market. One of the new features of the latest release is Storage Spaces Direct. This is an evolution of Storage Spaces, which was introduced in Server 2012 and enhanced in 2012 R2. We at TVP think that this is interesting technology. Its requirements are similar to VSAN's and are summarized below (a short sketch after the list illustrates a few of these checks):
- All Servers and Server components must be Windows Server Certified
- A minimum of four servers and a maximum of sixteen servers
- Dual-socket CPU and 128 GB memory
- 10 GbE or faster NICs must be used
- RDMA-capable NICs are strongly recommended for best performance and density
- If RDMA-capable NICs are used, the physical switch must meet the associated RDMA requirements
- All servers must be connected to the same physical switch or switches
- Minimum two performance devices and four capacity devices per server, with capacity devices being a multiple of performance devices
- Simple HBA required for SAS and SATA devices; RAID controllers or SAN/iSCSI devices are not supported
- All disk devices and enclosures must have a unique ID
- MPIO, or physically connecting disks via multiple paths, is not supported
- The storage devices in each server must have one of the following configurations:
- NVMe + SATA or SAS SSD
- NVMe + SATA or SAS HDD
- SATA or SAS SSD + SATA or SAS HDD
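As a rough illustration (this is not a Microsoft tool), the sketch below checks a planned cluster against a few of the headline requirements above; all function and variable names are our own.

```python
# Minimal sanity check of a planned Storage Spaces Direct cluster against
# some of the requirements listed above.
VALID_DEVICE_MIXES = {("NVMe", "SSD"), ("NVMe", "HDD"), ("SSD", "HDD")}

def check_s2d_plan(servers, perf_devices, cap_devices, perf_type, cap_type):
    errors = []
    if not 4 <= servers <= 16:
        errors.append("cluster must contain between 4 and 16 servers")
    if perf_devices < 2 or cap_devices < 4:
        errors.append("need at least 2 performance and 4 capacity devices per server")
    elif cap_devices % perf_devices != 0:
        errors.append("capacity devices must be a multiple of performance devices")
    if (perf_type, cap_type) not in VALID_DEVICE_MIXES:
        errors.append(f"unsupported device mix: {perf_type} + {cap_type}")
    return errors or ["configuration looks valid"]

# Example: a ten-node cluster with two NVMe performance devices and four
# SSD capacity devices per node.
print(check_s2d_plan(servers=10, perf_devices=2, cap_devices=4,
                     perf_type="NVMe", cap_type="SSD"))
```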
What is interesting here is that if one is (as we are) already licensed for the Datacenter edition, then this is a built-in feature available at no extra cost. This is a compelling story and does not change our running total of $143K.
Part 2a Summary
We have now moved into the arena of hyperconverged environments, and the costs are starting to rise. Our costs for the vSphere-based environment have risen to over $170K in software licensing alone, not including any Windows licensing that may need to be added. It is true that the Hyper-V 2012 R2 cost also rises, to roughly $173K, as there is no like-for-like option and a third-party product is needed. However, the cost of that third-party option is not as high as the VSAN uplift. That may not remain the case as capacity grows: the simple addition of 2 TB per node would push each node into the next license tier and double the cost of the DataCore solution. Things really start to get interesting when you look at Windows Server 2016. (Note that this is still in tech preview, so some features may change.) Here, we have the ability to utilize local storage across a Hyper-V cluster of up to sixteen nodes, with no tax for capacity and no tax for adding it to a Hyper-V environment. That makes for significant value.