By Greg Schulz, Server and StorageIO @storageio
Depending upon your or somebody else's definition of a storage hypervisor, you may or may not be using one, or realize it.
If your view of a storage hypervisor is a storage I/O optimization technology that addresses performance and other issues with virtual machines (VMs) and their hypervisors, such as Virsto or ScaleIO, among others, you might be calling those storage hypervisors as opposed to middleware, management tools, drivers, plug-ins, shims, accelerators, or optimizers.
On the other hand, if you are using a storage solution that supports multi-tenant capabilities, such as those from EMC, HP 3PAR, IBM, or NetApp (MultiStore), you may or may not be using a storage hypervisor. And if your view of storage hypervisors and storage virtualization is based on hardware that can support third-party storage arrays or JBOD shelves, then solutions from EMC (VMAX and ATMOS), HDS, IBM, and NetApp, among others, are storage hypervisors, given their virtualization and abstraction capabilities.
On the other hand, if you are using tin-wrapped software on an appliance-based technology that abstracts underlying storage, you might be using a storage hypervisor. Tin-wrapped software refers to software that is architecturally independent of the underlying hardware and that is made available as an appliance on a physical server with storage. One of the value benefits is that, for organizations that have a different set of purchasing or acquisition requirements for software vs. hardware, the software as an appliance may count as hardware. Another benefit of tin-wrapped software is that the pre-configuration, setup, testing, and systems integration are done by somebody else.
Some tin-wrapped software or appliances can architecturally support different vendors' underlying storage technology; however, for sales and marketing purposes some may be limited to a single vendor's solution while others are more open. Storage options can include vendor-specific or third-party storage systems or JBOD attached via iSCSI, SAS, SCSI_FCP (e.g. Fibre Channel or FC), FCoE, InfiniBand SRP (SCSI RDMA Protocol), or NFS and CIFS NAS. There are many examples of tin-wrapped software, more commonly known as appliances, including Actifio, EMC DLm, Vplex, ATMOS and Recoverpoint, FalconStor, InMage, and NetApp StorageGrid, among others. Let us not forget about physical server or virtual machine based volume managers that can also virtualize or abstract underlying storage resources.
What if you are using a virtual storage appliance (VSA) like the VMware vSphere Storage Appliance, HP (LeftHand), or NetApp Edge (ONTAP), among many others? Alternatively, how about storage systems that also host virtual machines, such as Pivot3, SimpliVity, or Nutanix, among others? Then you too may be using a storage hypervisor.
How about software-defined storage (SDS), the up-and-coming buzzword bandwagon playing off software-defined networking (SDN): would those solutions or services be storage hypervisors?
Granted, storage hypervisor is a newer, trendier, cooler term for what some might see as old, tired, and boring (e.g. storage virtualization or virtual storage). On the other hand, if storage hypervisor is not new enough, then feel free to jump on the software-defined storage (SDS) bandwagon. In other words, what some are calling storage hypervisors have been called storage virtualization (figure 1), virtual storage, and management tools, among other names, for several years if not decades.
Storage virtualization, virtual storage, and storage hypervisors, like server hypervisors and virtualization, can be implemented in different ways and in various locations. They provide abstraction of physical resources along with emulation as basic tenets, which can then be used to enable aggregation (consolidation or pooling), agility and flexibility, BC and DR, replication, snapshots, and other data services.
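To make those two tenets concrete, here is a minimal Python sketch (all names are hypothetical and illustrative, not any vendor's API) in which heterogeneous physical resources are abstracted behind a uniform read/write interface and a single pooled virtual LUN is emulated on top of them:

```python
# Minimal sketch: abstraction of heterogeneous backends plus emulation of
# a single virtual device. All names are hypothetical and illustrative.

class PhysicalBackend:
    """Abstracts one physical resource: a SAN LUN, NAS share, or JBOD disk."""
    def __init__(self, name, capacity_gb):
        self.name = name
        self.capacity_gb = capacity_gb
        self.blocks = {}                      # block address -> data

    def write(self, block, data):
        self.blocks[block] = data

    def read(self, block):
        return self.blocks.get(block, b"\x00")

class VirtualLUN:
    """Emulates one logical device pooled across several backends."""
    def __init__(self, backends):
        self.backends = backends
        self.capacity_gb = sum(b.capacity_gb for b in backends)

    def _locate(self, block):
        # Trivial round-robin placement; real systems use extent maps.
        return self.backends[block % len(self.backends)]

    def write(self, block, data):
        self._locate(block).write(block, data)

    def read(self, block):
        return self._locate(block).read(block)

# Two dissimilar physical resources presented as one virtual device.
vlun = VirtualLUN([PhysicalBackend("fc-array", 500),
                   PhysicalBackend("jbod-sas", 1000)])
vlun.write(42, b"payload")
assert vlun.read(42) == b"payload"
print(f"virtual LUN capacity: {vlun.capacity_gb} GB")
```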
Underlying storage, depending on the specific vendor implementation, might be heterogeneous (any vendor's products), including JBOD or RAID-based systems, or homogeneous (a specific vendor's products). Physical storage can be internal dedicated direct attached (DAS); external dedicated or shared DAS; shared iSCSI, SAS, FC, FCoE, or InfiniBand SRP for block-based access; or NAS file or object access, with SSD, HDD, or in some cases PCIe flash cards, tape, and cloud services such as Amazon S3 or Glacier, HP, and Rackspace, among others.
Functionality can also vary, with some solutions being very thin or light, relying on and leveraging underlying storage systems' functionality (such as EMC Vplex), while others are rich and deep, essentially able to eliminate or provide the same capabilities as a traditional storage system (e.g. Actifio, FalconStor, IBM SVC, and DataCore, among others).
In addition to being implemented in different locations, storage virtualization or storage hypervisors can be in-band, out-of-band, or fast-path control-path. In-band architectures are, as their name implies, those where the solution (software, appliance, or storage system) sits between the application servers and the underlying storage, in the data stream. Most solutions are in-band, with some fast-path control-path. Sitting in the data path adds some overhead in exchange for extra functionality. Some vendors might claim there is no overhead, added latency, or performance impact; however, I tend to believe those who claim minimal or no perceived impact.
Out-of-band, as its name implies, relies on an additional appliance and/or drivers and agents installed on application servers. Another variation is fast-path control-path, which provides a balance by minimizing impact to the data stream, getting involved only when needed and then getting out of the way. In-band architectures are more common because they are relatively easier to implement, whereas fast-path control-path solutions have been slower to evolve given dependencies on switches and other technology. Fast-path control-path solutions tend to focus on adding value and leveraging underlying storage systems' capabilities (such as EMC Vplex), vs. in-band systems that can be used to replace storage systems.
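A rough sketch of the contrast, again with hypothetical names and a deliberately simplified placement scheme: in the in-band model every I/O passes through the virtualization layer, while in the fast-path control-path model the layer is consulted only to resolve where a block lives and the host then performs the I/O directly against the backing storage.

```python
# Hypothetical sketch of in-band vs. fast-path control-path data flow.

class Backend:
    def __init__(self, name):
        self.name, self.store = name, {}
    def write(self, block, data):
        self.store[block] = data

class InBandVirtualizer:
    """Sits in the data path: receives every I/O and forwards it,
    trading some added latency for functionality."""
    def __init__(self, backends):
        self.backends = backends
    def write(self, block, data):
        self.backends[block % len(self.backends)].write(block, data)

class ControlPathVirtualizer:
    """Stays out of the data path: resolves block -> backend (metadata
    only); the host then performs the I/O directly."""
    def __init__(self, backends):
        self.backends = backends
    def locate(self, block):
        return self.backends[block % len(self.backends)]

backends = [Backend("array-1"), Backend("array-2")]

inband = InBandVirtualizer(backends)
inband.write(7, b"x")                 # data flows through the appliance

ctl = ControlPathVirtualizer(backends)
ctl.locate(7).write(7, b"x")          # metadata lookup, then direct I/O
```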
Keep in mind, vendor-specific features, functionality, and preferences aside, that the fundamental abilities of virtualization are abstraction and emulation. From those two basic capabilities comes the ability to combine, to add more features and functionality, and to deploy in different places for various reasons (consolidation, masking hardware complexities, flexibility, interoperability, improved hardware use). In other words, the basic and fundamental capabilities often discussed as features or benefits of virtualization (server, storage, networking, I/O, and desktop) tie back to abstraction and emulation.
Both server and storage virtualization are moving into life beyond consolidation, where the focus expands from consolidation and pooling to enabling flexibility for different benefits. In the case of storage virtualization a decade ago, a primary value proposition of proponents was LUN or volume pooling, or using commodity hardware. While there have been customer success stories, many have found more value in the increased flexibility of using an abstraction layer (e.g. virtualization or a storage hypervisor) across homogeneous (same vendor, make, and model) or heterogeneous (different vendors, makes, and models) storage systems.
This gets back to what was mentioned earlier: if your view of a storage hypervisor or storage virtualization solution is software based, then you will probably gauge success based on those solutions. On the other hand, if your view of storage virtualization and storage hypervisors is more pragmatic, including homogeneous solutions using either hardware or software, then you will see a different set of success stories.
Here is my point: there is a lot of virtual storage marketing hype around what is or is not a storage hypervisor, similar to what has occurred in the past over what is or is not storage virtualization. Consequently, depending on your preferences, sphere of influence, or who you listen to, believe, sell for, or promote, what is or is not a storage hypervisor will vary, the same as what is or is not storage virtualization.
Now if you are a vendor, VAR, pundit, surrogate, or anybody else who is not feeling comfortable right about now, relax for a moment. Take a deep breath and count to some number (you decide). Then continue reading before you dispatch your truth squads to set me straight for raining on your software-defined virtual storage hypervisor parade ;).
There are many forms of storage virtualization, including aggregation or pooling, emulation, and abstraction of different tiers of physical storage, providing transparency of physical resources. Storage virtualization can be found in different locations (figure 1), such as server-based software, or in the network or fabric using appliances, routers, or blades, with software in switches or switching directors. Storage virtualization (excuse me, storage hypervisor) functionality can also be found running as software on application servers (physical or virtual machines) or operating systems, in network-based appliances, switches, or routers, as well as in storage systems.
Various storage virtualization services are implemented in different locations to support various tasks. Storage virtualization functionality includes pooling or aggregation of both block- and file-based storage; virtual tape libraries for coexistence and interoperability with existing IT hardware and software resources; global or virtual file systems; and transparent migration of data for technology upgrades and maintenance, along with support for high availability, business continuance, and disaster recovery.
Storage virtualization functionalities include:
- Pooling or aggregation of storage capacity
- Transparency or abstraction of underlying technologies
- Agility or flexibility for load balancing and storage tiering
- Automated data movement or migration for upgrades or consolidation
- Heterogeneous snapshots and replication on a local or wide area basis
- Thin and dynamic provisioning across storage tiers (sketched after this list)
- And many others
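As an illustration of one item on that list, below is a minimal thin provisioning sketch (hypothetical and simplified, not any vendor's implementation): a volume advertises its full logical size, but physical extents are consumed from a shared pool only when a block is first written.

```python
# Minimal thin-provisioning sketch (hypothetical and simplified).
# A volume advertises its full logical size, but physical extents are
# consumed from the shared pool only when a block is first written.

EXTENT_MB = 4

class ThinPool:
    def __init__(self, physical_extents):
        self.free = physical_extents          # extents backed by real disk

    def allocate(self):
        if self.free == 0:
            raise RuntimeError("pool exhausted: add capacity or reclaim")
        self.free -= 1

class ThinVolume:
    def __init__(self, pool, logical_extents):
        self.pool = pool
        self.logical_extents = logical_extents   # advertised (thin) size
        self.mapped = set()                      # extents actually backed

    def write(self, extent):
        if extent not in self.mapped:            # allocate on first write
            self.pool.allocate()
            self.mapped.add(extent)

pool = ThinPool(physical_extents=256)            # 1 GB of real capacity
vol = ThinVolume(pool, logical_extents=2560)     # presents 10 GB

vol.write(0)
vol.write(0)                                     # rewrite: no new extent
print(f"advertised {vol.logical_extents * EXTENT_MB} MB, "
      f"backed {len(vol.mapped) * EXTENT_MB} MB, "
      f"{pool.free} extents free in pool")
```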
Aggregation and pooling for consolidation of LUNs, file systems, and volumes, along with associated management, are intended to increase capacity utilization and investment protection, including supporting heterogeneous data management across different tiers, categories, and price bands of storage from various vendors. Given the focus on consolidation of storage and other IT resources along with continued technology maturity, more aggregation and pooling solutions can be expected to be deployed as storage virtualization matures.
While aggregation and pooling are growing in popularity in terms of deployment, most current storage virtualization solutions are forms of abstraction. Abstraction and technology transparency include device emulation, interoperability, coexistence, backward compatibility, transitioning to new technology with transparent data movement and migration, and supporting HA, BC, and DR. Other types of virtualization in the form of abstraction and transparency include heterogeneous data replication or mirroring (local and remote), snapshots, backup, data archiving, security, compliance, and application awareness.
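One of those abstraction-based data services, the copy-on-write snapshot, is simple enough to sketch (again hypothetical and simplified, not any vendor's implementation): a snapshot shares all blocks with its source volume until a block is overwritten, at which point only the pre-write copy of that block is preserved.

```python
# Minimal copy-on-write snapshot sketch (illustrative, not any vendor's
# implementation). A snapshot shares all blocks with the live volume
# until a block is overwritten; only then is the old data preserved.

class Volume:
    def __init__(self):
        self.blocks = {}             # block address -> live data
        self.snapshots = []          # per-snapshot {block: pre-write data}

    def snapshot(self):
        delta = {}
        self.snapshots.append(delta)
        return delta

    def write(self, block, data):
        old = self.blocks.get(block)
        for delta in self.snapshots:
            if block not in delta:   # copy-on-write: preserve old data once
                delta[block] = old
        self.blocks[block] = data

    def read_snapshot(self, delta, block):
        # Diverged blocks come from the delta; unchanged blocks are
        # still shared with the live volume.
        return delta[block] if block in delta else self.blocks.get(block)

vol = Volume()
vol.write(1, b"v1")
snap = vol.snapshot()
vol.write(1, b"v2")                          # b"v1" is saved to the delta
assert vol.read_snapshot(snap, 1) == b"v1"   # snapshot view is unchanged
assert vol.blocks[1] == b"v2"                # live volume sees new data
```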
This is not to say that there are not business cases for pooling or aggregating storage, rather that there are other areas where storage virtualization techniques and solutions can be applied. This is not that different from server virtualization expanding beyond just consolidation. The next wave (we are in it now) for server, storage, and other forms of virtualization is life beyond consolidation, where the focus expands to enablement and agility in addition to aggregation.
The best type of storage virtualization and the best place to have the functionality will depend on your preferences. The best solution and approach are the ones that enable flexibility, agility, and resiliency to coexist with or complement your environment and adapt to your needs. Your answer might be one that combines multiple approaches, as long as the solution works for you and not the other way around.
How about volume managers and global namespaces or file systems? Again, IMHO, yes; after all, a common form of storage virtualization is the volume manager, which abstracts physical storage from applications and file systems. This is about where some become as relaxed as a long-tailed cat next to a rocking chair (Google it if you do not know) when virtualization, let alone a storage hypervisor, is mentioned alongside volume managers, global namespaces, or global file systems. In addition to providing abstraction of different types, categories, and vendors' storage technologies, volume managers can also be used to support aggregation, performance optimization, and most other functions commonly found in storage or virtual storage appliances or systems. For example, volume managers can aggregate multiple types of storage into a single large logical volume group that is subdivided into smaller logical volumes for file systems.
In addition to aggregating physical storage, volume managers can do RAID mirroring or striping for availability and performance. Volume managers also provide a layer of abstraction that allows different types of physical storage to be added and removed for maintenance and upgrades without impacting applications or file systems. Common management functions supported by volume managers include storage allocation, provisioning, and data protection operations such as snapshots and replication, all of which vary by specific vendor implementation. File systems, including clustered and distributed file systems, can be built on top of, or in conjunction with, volume managers to support scaling of performance, availability, and capacity.
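A minimal sketch of that volume manager model (hypothetical names, loosely patterned after the physical volume, volume group, and logical volume concepts described above): dissimilar disks are aggregated into one group, from which logical volumes are carved with different layouts.

```python
# Hypothetical volume-manager sketch: physical volumes aggregate into a
# volume group, which is subdivided into logical volumes with different
# layouts (mirroring consumes double the space in this simplified model).

class PhysicalVolume:
    def __init__(self, name, size_gb):
        self.name = name
        self.size_gb = size_gb

class VolumeGroup:
    def __init__(self, pvs):
        self.pvs = pvs
        self.free_gb = sum(pv.size_gb for pv in pvs)   # pooled capacity
        self.lvs = []

    def create_lv(self, name, size_gb, layout="linear"):
        needed = size_gb * 2 if layout == "mirror" else size_gb
        if needed > self.free_gb:
            raise ValueError("not enough free space in volume group")
        self.free_gb -= needed
        lv = {"name": name, "size_gb": size_gb, "layout": layout}
        self.lvs.append(lv)
        return lv

# Aggregate dissimilar disks into one pool, then carve logical volumes.
vg = VolumeGroup([PhysicalVolume("sas0", 300), PhysicalVolume("sata0", 900)])
vg.create_lv("db", 200, layout="stripe")     # striped for performance
vg.create_lv("home", 100, layout="mirror")   # mirrored for availability
print(vg.free_gb, "GB free;", [lv["name"] for lv in vg.lvs])
```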
What about OpenStack Swift or other cloud and object storage software: can they qualify as storage hypervisors? My guess is that if you are looking at things from a server virtualization perspective, or a VMware, Microsoft Hyper-V, KVM, or Xen viewpoint, the cloud and object solutions might not qualify for storage hypervisor hype status. However, keep in mind that most of these solutions support various vendors' underlying hardware (software, servers, storage) while providing an abstraction layer, adding capabilities or complementing those enabled by the underlying technology. Thus, IMHO, given what some have stretched to call or include as storage hypervisors, virtual storage, or storage virtualization, sure, the cloud and object solutions easily can qualify. Examples include OpenStack Swift, Basho Riak CS, EMC ATMOS (now available as a VSA, as well as supporting external third-party storage), Ceph, and Cleversafe, among many others, based on their ability to abstract, emulate, and enable added services.
How about gateways, including disk and tape libraries, that can support and virtualize different types of local and remote cloud storage services while emulating different devices? Sure. If they provide some emulation, abstraction, and added capabilities, whether running on a dedicated server (as an appliance or tin-wrapped software), in a virtual machine, or as a VSA, why not? For example, the Amazon AWS Storage Gateway, or those from Zadara, the EMC Cloud Tiering Appliance, Gladinet, Avere, Microsoft StorSimple, Nasuni, or TwinStrata, among others, not to mention VTLs and data protection appliances from Actifio, EMC, IBM, HP, FalconStor, Fujitsu, and Quantum, among others.
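As a rough illustration of the gateway pattern (hypothetical names, not any listed vendor's implementation): the gateway presents what looks like local storage, keeps hot blocks in a local cache, and tiers cold blocks out to a cloud object store behind a simple put/get interface, recalling them transparently on a read miss.

```python
# Hypothetical cloud-gateway sketch: emulate local storage, cache hot
# blocks, tier cold blocks to an S3-style put/get object store.

class CloudObjectStore:
    """Stands in for a cloud service: put/get by key."""
    def __init__(self):
        self.objects = {}
    def put(self, key, data):
        self.objects[key] = data
    def get(self, key):
        return self.objects[key]

class CloudGateway:
    def __init__(self, cloud, cache_limit=2):
        self.cloud = cloud
        self.cache_limit = cache_limit
        self.cache = {}                       # hot tier: block -> data

    def write(self, block, data):
        self.cache[block] = data
        while len(self.cache) > self.cache_limit:
            # Evict the oldest cached block to the cloud (cold tier).
            cold = next(iter(self.cache))
            self.cloud.put(f"block/{cold}", self.cache.pop(cold))

    def read(self, block):
        if block not in self.cache:
            # Transparent recall from the cloud tier on a cache miss
            # (a fuller model would also evict here).
            self.cache[block] = self.cloud.get(f"block/{block}")
        return self.cache[block]

gw = CloudGateway(CloudObjectStore())
for b in range(4):
    gw.write(b, bytes([b]))                   # blocks 0 and 1 tier out
assert gw.read(0) == b"\x00"                  # recalled from the cloud
```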
What this all means
Keep in mind that there are many different types of storage devices, systems, and solutions for addressing various needs, with varying performance, availability, capacity, energy, and economic characteristics. Likewise, there are different tools, techniques, and approaches for abstracting storage resources for various purposes, from emulation to agility to consolidation to interoperability, among others.
Rest assured, there is plenty of hype around storage hypervisors, including what is or is not one, along with storage virtualization and virtual storage. However, there is also plenty of reality in and around storage hypervisors, virtual storage, and storage virtualization, all of which can be used for different tasks and implemented in various ways with diverse features, functionality, and architectures.
Storage virtualization (and storage hypervisor) considerations include:
- What are the various application requirements and needs?
- Will it be used for consolidation or facilitating IT resource management?
- What other technologies are in place or planned?
- What are the scaling (performance, capacity, availability) needs?
- Will the point of vendor lock-in shift, or will costs increase?
- What are some alternative and applicable approaches?
- How will a solution scale with stability (performance, availability, capacity)?
- What are the management tools, along with plug-ins and feeds for other tools?
- What are the hardware and software dependencies, service and support?
- Who is called and will answer the phone when something breaks?
- What APIs and interfaces are supported (VAAI, VASA, VADP, ODX, CDMI)?
- Are any drivers or other software required on servers accessing the storage?
- What does the vendor's interoperability certification and tested support matrix look like?
The above is not an exhaustive list; rather, these are things to think about and consider, which will vary based on your needs, preferences, and wants. Take a step back and understand what it is that you need (requirements) vs. what you want (nice-to-haves or preferences) for different scenarios. Some of the tools and technologies mentioned, among their peers, have the flexibility to be used in different ways; however, just because something can be stretched or configured to do a task does not mean it is the best fit for a given situation. In addition, when it comes to storage hypervisors, virtual storage, and storage virtualization, look for solutions that work for you, vs. you having to work for them. This means that they should remove or mask complexity instead of adding more layers to manage and take care of.
Storage Hypervisor Wrap-up
In the end, there is virtual storage, storage virtualization, storage hypervisors, and hyped storage, along with the growing trend to software-define everything as part of playing buzzword bingo.
I wonder if 2013 will be the year of the software-defined virtual storage hypervisor with cloud (public, private, and hybrid) multi-tenant capabilities. In the meantime, welcome to the wide world of storage hypervisors, virtual storage, storage virtualization, and related topics, themes, technologies, and services for cloud, virtual, and data storage networking environments.
Ok, nuff said (for now).
Comments

Great distillation of buzzword bingo with a focus on what all these storage innovations are accomplishing: removing complexity at different tiers.
You did forget one technology: NFSv4.1, pNFS, and (in the future) FedFS.
One goal of these protocol enhancements is to reduce the complexity of the storage namespace. The result is a storage access protocol that has greater independence from the underlying filesystem and provides flexibility.
Thanks for the notes and comments. Yes, I did leave out NFSv4.1 and pNFS (which has been declared the future for how many years now? 😉), along with a whole bunch of other protocols, interfaces, personalities, tools, and technologies that perhaps will get covered in a future post here. Otherwise, they have and do get mentioned in some of my other content elsewhere ;)…
From my perspective, reduce as many components as you can.
Move from a three-tier solution (x86 servers, network, storage) to a two-tier solution (x86 servers and network). Looking back to 1998, when VMware started with virtualization, at first an x86 server with local storage was used. Then when VMotion came up, shared storage came up, so the hard disk of the server was moved out to a box (a storage system). Then the storage abstraction layer came up (called storage virtualization or, as you mentioned, other names) to stretch the disk over many locations. Now in 2013 the disk is going back into the servers, and storage virtualization is abstracting all the disks from all the x86 servers into a virtual SAN or virtual NAS (Nutanix). This means the disk is back in the server, so it has a short path and therefore lower latency, which means a performance win from a physics standpoint. So now we are back where we were in the 2000s, when a physical environment had only two main physical components: x86 servers and network. From my point of view I only need a Nutanix solution and a network switch, which can be deployed in two hours. A network switch should come with 1,000 pre-configured virtual networks (VLANs, fully tested). Then every two to three years you should kick out the old environment, per Moore's law.
With these definitions, at Starboard Storage (starboardstorage.com) we use a storage hypervisor to pool disk for capacity, pool SSD for performance, and virtualize the provisioning of SAN volumes and NAS shares for workload consolidation. youtube.com/watch?v=82xUwXjZEGc