Nutanix OS 3.5: Deduplication, New GUI, SRM, Hyper-V Support

Nutanix, one of the fastest-growing IT infrastructure startups around, shows no signs of slowing down with the release of Nutanix OS 3.5. For those not familiar with Nutanix, they offer a truly converged, virtualized infrastructure. This generally consists of four nodes in two rack units of space, where each node has CPU, RAM, traditional fixed disk, SSD, and Fusion-IO flash built in. Their secret sauce is really NDFS, the Nutanix Distributed File System, built by the same folks who created Google’s File System, as well as a unified, hypervisor-agnostic management interface.

Nutanix OS 3.5 adds what they call the “Elastic Deduplication Engine,” an inline deduplication facility for RAM and flash that aims to cram more data into those expensive, fast storage tiers. They claim they’ve been able to store up to 10x the amount of data in RAM and flash as a result, which has a very positive effect on storage performance, most notably latency. It also helps drive up VM density, especially in environments with a lot of common data, such as VDI deployments.
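
To make the idea concrete, here is a minimal sketch of inline, fingerprint-based deduplication, assuming fixed-size chunks, SHA-1 fingerprints, and a simple in-memory map; the details are illustrative and not Nutanix’s actual implementation.

```python
import hashlib

CHUNK_SIZE = 16 * 1024  # assumed chunk size, for illustration only


class DedupCache:
    """Illustrative content-addressed cache: identical chunks are stored once."""

    def __init__(self):
        self.chunks = {}      # fingerprint -> the single stored copy of the chunk
        self.extent_map = {}  # (volume, offset) -> fingerprint

    def write(self, volume, offset, data):
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            fp = hashlib.sha1(chunk).hexdigest()   # fingerprint computed inline, on ingest
            self.chunks.setdefault(fp, chunk)      # a duplicate chunk adds only a reference
            self.extent_map[(volume, offset + i)] = fp

    def read(self, volume, offset):
        fp = self.extent_map[(volume, offset)]
        return self.chunks[fp]                     # many extents can share one cached copy


# Two VDI desktops writing the same golden-image data end up sharing one copy.
cache = DedupCache()
cache.write("vdi-desktop-01", 0, b"windows golden image block" * 100)
cache.write("vdi-desktop-02", 0, b"windows golden image block" * 100)
assert len(cache.chunks) == 1
```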

While they say their deduplication approach is extensible to all storage on their platforms, actual implementation on other types of storage is left to a future version. This feature set, while a welcome addition, does trail some of their competition. For example, Simplivity’s OmniCube Accelerator deduplicates once, then stores that deduplicated data on any tier of storage.

Version 3.5 of their OS also brings with it much-desired GUI changes, implemented in HTML5 for cross-platform browser compatibility and aimed at making IT staff’s lives easier by simplifying the environment. They call it a “consumer-grade user experience,” which is something more IT vendors should strive for. GUIs are oft-overlooked, an afterthought implemented by an engineer who will never use the product day to day the way customers will. By focusing on decent UX design, vendors can reduce human error, make root causes of problems easier to find, and let IT staff go to work not dreading the tools at their disposal.

Nutanix OS natively offers replication services between Nutanix clusters as part of NDFS, and this release adds compression to that traffic, as well as the ability to be controlled by VMware’s Site Recovery Manager (SRM). To initiate failovers, SRM uses “Storage Replication Adapters” (a fancy name for what are often just Perl scripts) to bridge the gap between SRM and the storage array APIs. This is an attractive feature for larger enterprises that have already built their DR/BC/COOP runbooks around SRM.
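
To illustrate the role such an adapter plays, here is a rough sketch of the request/response shape one of these scripts can take. The command names, XML layout, and handler behavior are simplified placeholders of my own, not the actual SRA specification or any vendor’s adapter, and the example uses Python rather than Perl purely for readability.

```python
#!/usr/bin/env python3
"""Illustrative shape of a storage-replication-adapter-style script.

SRM drives the adapter by handing it an XML request and reading back an XML
reply. The command names and XML fields below are simplified placeholders,
not the real SRA specification or any storage vendor's API.
"""
import sys
import xml.etree.ElementTree as ET


def discover_arrays(_request):
    # A real adapter would query the storage platform's replication API here.
    return "<Response><ArrayId>cluster-a</ArrayId></Response>"


def failover(_request):
    # Promote replica copies at the recovery site so SRM can power VMs on.
    return "<Response><Status>success</Status></Response>"


HANDLERS = {
    "discoverArrays": discover_arrays,
    "failover": failover,
}


def main():
    request = ET.fromstring(sys.stdin.read())          # XML request from SRM
    command = request.findtext("Command", default="")
    handler = HANDLERS.get(command)
    if handler is None:
        sys.stdout.write("<Response><Error>unsupported command</Error></Response>")
        return
    sys.stdout.write(handler(request))                 # XML reply back to SRM


if __name__ == "__main__":
    main()
```

A production adapter would have to implement the full command set SRM expects (device discovery, test failover, reprotect, and so on), but this request/dispatch/respond pattern is the bridge being described here.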

Nutanix says there are over 75 enhancements in OS 3.5, which will be available to all customers with active maintenance contracts beginning in September. The last new feature I find notable is their beta support for Microsoft Hyper-V Server 2012, as well as making KVM support a first-class citizen on their platforms. This is great news for non-VMware shops and shops exploring additional hypervisors and cloud platforms like OpenStack.

Overall, this release is a great one, and it shows Nutanix has their eye on the problems that actually plague IT, coming up with clever and cost-efficient solutions to them in software while also delivering a range of hardware to meet enterprise and small-scale IT needs. They’ve become one of the poster children for converged and software-defined IT. I am anxious to see what they might do with other areas of IT in the future, like networking, as “software-defined” makes its way deeper into our data centers.

[Screenshots: Nutanix OS 3.5 - Interface; Nutanix OS 3.5 - Disk Interface]

4 replies on “Nutanix OS 3.5: Deduplication, New GUI, SRM, Hyper-V Support”

  1. In my opinion, Nutanix and Simplivity are the leading hyperconverged infrastructure products. I was much more interested in Nutanix at the start of my evaluation; however, I discovered a very efficient and extensible architecture in the Omnicube.

    The Omnicube performs deduplication and compression at data inception, before writing to any tier of storage. Importantly, this preserves CPU cycles for actual workloads versus the overhead incurred by “post-processing” data optimization. The Nutanix storage elements have been redesigned three times, from FusionIO through two additional iterations of Intel flash. I understand that this is Nutanix evolving, but such critical ‘hot’ storage integration should have been thoroughly vetted.

    Simplivity’s replication behavior is proactive in the context of HA. A failed-over VM is immediately able to start on the next Omnicube. The Nutanix replication scheme is reactive, with the destination failover node requesting the remainder of the failed VM’s composition data from the cluster before starting it. Scale this behavior to many VMs and/or a stretched cluster and it’s a valid consideration.

    The Omnicube is backup-ready without additional licensing, and its software stack is compatible with Amazon’s EC2 for public cloud replication. The Omnicube’s clustered NFS storage can be mounted by existing servers, whereas NDFS cannot. Simplivity’s simplification of the virtualization stack and respect for business requirements is greatly appreciated, especially when paying the premium for *hyperconverged* capabilities.

    Finally, the Dell-based Omnicube costs less than the SuperMicro-based Nutanix product and support is included. For these reasons, Simplivity is the solution of choice.

  2. Who do you work for? Your post reads like a scripted Simplivity FUD sheet aimed at Nutanix.

    Disclaimer: I work for Nutanix

    Some additional thoughts: Nutanix delivers all intelligence and data management at the software layer, without relying on any hardware crutches. Our first-generation system used Fusion-io, back when there weren’t many options for server-attached flash. Intel jumped into the game with PCIe SSDs and, more recently, a blazing-fast SSD, and because our intelligence is defined in software, we were quickly able to utilize this new hardware in our new generation of systems (1000, 3000, and 6000).

    Your statement on HA is incorrect. During a host failure, vSphere HA responds as it would on any system with shared storage and powers on VMs from the failed host on other hosts in the vSphere cluster. Nutanix will seamlessly direct each VM to its replica dataset, no matter where in the Nutanix cluster it resides. vSphere HA doesn’t miss a beat (repeat: no delay in VM power-on), and Nutanix NDFS will perform a background migration to bring that VM’s data to the local node to provide data locality for high-speed storage performance. This brief example shows the difference between a true scale-out file system like Nutanix NDFS and a federated group of servers like the Simplivity Omnicube.

    I would be happy to go into more depth if your local account team isn’t already engaged.

  3. Agreed, I believe both of these companies are moving the hyperconverged infrastructure space forward, and I’m looking forward to seeing where each of them goes. However, I’m not sure I completely agree with all of your statements.

    Compression at inception definitely makes sense… for non-random (i.e., sequential) workloads. How is it possible to truly compress data at inception efficiently for randomized workloads? In my opinion, it’s not truly possible to do this efficiently; it would absolutely kill random IO performance and increase random write latency. But for sequential workloads, absolutely.

    Nutanix will detect the IO pattern (random vs. sequential) and only compress sequential streams of data, thus keeping random write performance high. As uncompressed data cools (e.g., becomes unaccessed as part of ILM), it becomes eligible for compression.

    How is it possible to perform dedupe before any data is written to persistent storage? I’d assume that data is written persistently to some sort of media (whether a write-back or write-through buffer is being used). Is Simplivity only deduping data on ingress for data sitting in the buffer? If so, how can that be effective?

    From my perspective, this would absolutely add overhead, primarily for random IO workloads, which would lead to increased write latency. Understood, the proprietary FPGA used by Simplivity does alleviate this; however, because the work is done inline, it will still affect write latency.

    What happens when that FPGA fails? That’s a single point of failure.

    With the new Elastic Dedupe Engine that Nutanix released, data is only fingerprinted on ingress and deduplicated on read.

    I’m also wondering how switching manufacturers implies a storage element redesign. Nutanix uses a purely software-defined methodology and can work with any vendor/storage provider.

    Where did you get your data that a VM could not be automatically started during an HA event? That is false. The VM will start immediately; it will just initially be reading data that may be remote, and that data will then be brought local to provide localized IO. All write IOs are immediately localized.

    Also, doesn’t Simplivity use pairs for redundancy? How will that scale? With Nutanix, all of the nodes are utilized for replication, which is much more effective, especially when talking about scale.

    Agree with you on the cloud integration; I want this. However, regarding existing ESXi hosts mounting their NFS, I know this is possible.

    When I make my decisions, I A) technically vet them and B) base them on industry perspective and utilization. I’ll wait for Simplivity to A) be a proven, production-worthy system, B) reach an $80 million run rate (i.e., show scale and size), and C) embrace the software-defined storage space, since FPGA != software-defined.

  4. Dan,

    I work for a midsize manufacturing company. Both the Nutanix and Omnicube appliances have hardware “crutches” (as you say) as well as proprietary software dependencies. This is a part of the risk/reward equation for converged infrastructure appliances.

    Emphasizing “blazing fast SSD” (versus what previous iteration?) is compensation for sub-optimal ‘front-side’ data optimization. What’s better than efficiently processing data from write initiation throughout all storage tiers and preserving CPU for actual workloads? A supercar analogy: raw horsepower (Lamborghini) versus efficient dynamics (GT-R).

    You didn’t state anything new about HA. vSphere HA functionality isn’t the focus; it’s about cluster mechanics and proactive design for host failure. VMs in an Omnicube cluster (what they call a federation) are distributed in real time, immediately able to activate on the failover target. In contrast, “Nutanix NDFS will perform a background migration to bring that VM’s data to the local node.” Your words, not mine.

    Each solution features 10Gb fabric, high-end processors, and storage. I agree that the HA restart of a single VM would have no perceptible delay. An HA restart of eight large VMs at a time would better demonstrate the initialization delay, and this is where I believe Simplivity has the edge. A stretched-cluster case will further accentuate the performance difference and the utility of a VM ‘ready start’ design.

    Steve Kaplan,

    Meow, why are you saying “they” like you don’t work at Nutanix? 😉 I appreciate your attention. The Omnicube scales out and all nodes are used for replication. The minimum Omnicube deployment is two appliances versus Nutanix’ three. I don’t have the answer to your random IO question. Still, microsecond latency penalties may be a justifiable tradeoff for more CPU workload capacity.

    Regarding your closing points: A) any infrastructure tech startup with promising diffusion characteristics must be production-proven before mass adoption, including Nutanix. You wouldn’t be the Nutanix VP of Channel/Strategic Sales, at least not for long, if early adopters were waiting to see Nutanix proven in a production environment. B) Run rate matters to you and to the risk-averse; however, it matters far less to the early adopters evaluating innovation on technical merit. These people are aware of meaningful innovation and the strategic industry implications. C) The principle of software-defined is terrific, but “embrace the software-defined storage space”? For the sake of your run rate? I’m buying on design and performance. If the Omnicube delivers the data IO characteristics and feature set that my business needs using an FPGA, then what’s the problem?
