Hard Disk Drives (HDD) for virtual environments (Part II) how drives differ

By Greg Schulz, Server and StorageIO @storageio

In part I of this series we looked at basic Hard Disk Drive (HDD) characteristics and wrapped up with the question of what is the best type of HDD to use?

I often get asked why there needs to be different types or tiers of data storage devices including HDD and Solid State Devices (SSDs), along with interfaces, why not just one or a few?

The answer is similar to why there are different types or tiers of servers or networks along with multiple virtualization or storage hypervisors. In other words there are different needs and requirements and thus various types of drives and interfaces, not one size fits all.

Like a book or many other things, do not judge an HDD simply by its cover, meaning look beyond the interface, cost, capacity and form factor. There is a lot more to an HDD than its packaging size, interface (e.g. SAS, SATA, FC or USB) and space capacity.

Storage tiering is used in the context of different things, from functionality for moving data around, to price bands or cost of storage systems, application usage, vendors and storage media or mediums. Our focus here is on the storage media or mediums, specifically HDDs.

The industry, via consolidation, mergers and acquisitions, is down to three HDD manufacturers: Seagate (STX), Toshiba and Western Digital (WD). Seagate and WD have the vast majority of market share. Toshiba (www.storage.toshiba.com) acquired the HDD business of Fujitsu. WD (www.wdc.com) bought the HDD business of Hitachi (e.g. HGST), not to be confused with Hitachi Data Systems (HDS), which sells Hitachi manufactured storage systems. Seagate (www.seagate.com) acquired the HDD business of Samsung in addition to other previous acquisitions including Maxtor. Depending on when you read this, either Seagate or WD may be the current market leader in different categories given their size and portfolios.

In general, HDDs (along with SSDs and HHDDs) are classified into four main groups that will vary by the manufacturers and their OEMs or partners.

General HDD classifications for internal and external include:

Enterprise (usually considered internal, also for cloud or service providers)
Desktop, notebook, workstation and or laptop (both internal and external)
Consumer (can be internal or external included in appliances or other solutions)
Specialized (for video, NAS, DVRs and other specific applications)

Looking beyond the cover of HDDs

As mentioned earlier, it can be easy to look at the cover of something to make comparisons and HDDs are no exception. On the surface, they can look similar when looking just at cost or capacity and form factor or interfaces. What are the differences between enterprise, desktop, notebook, consumer and specialized HDDs besides price? Looking under the covers or digging deeper into the various drives, there are differences in space capacity, interface type and speed, power consumption, reliability, warranties, performance (beyond interface speeds) and feature functionality among others. Another difference is that enterprise class HDDs are designed to be in continuous operation, which means 8,760 hours per year (24 hours x 365 days).

Alternatively, desktop and notebook drives are designed for start, stop and lower total power on time per year, for example 2,400 hours vs. 8,760 for enterprise class. Likewise, a desktop or notebook drive is designed for more load/unload operations (tied to power-up, parking and un-parking read/write heads and other tasks) vs. an enterprise drive.

HDD attributes that vary by tier, class, category, make and model include:

Availability and reliability
Cache (DRAM and nand flash)
Cost
Form factor (physical attributes)
Functionality and features
Interface type and speed
Performance of the device
Power or energy
Security
Space capacity
Warranties

Availability and reliability

The historical metric for determining HDD reliability has been Mean Time Between Failures (MTBF) on an individual device or pool of drives basis. MTBFs vary from hundreds of thousands of hours on notebook and desktop types of devices to 1.6 million hours for some enterprise class drives. If you are using a desktop drive designed for lower annual power on hours (POH) in a continuous mode and seeing higher than expected failures, you may be using the wrong technology. Likewise, if you have enterprise drives that you are constantly powering on and off and they start to fail, you might be using the wrong technology.

Keeping in mind that enterprise drives are designed to be in continuous use vs. starting and stopping, their total number of POH per year should be higher. For enterprise drives that are in constant use POH would be 8,760 (24 x 365), while for a desktop drive the POH might be 2,400 given start stop activity. With increasing size or number of drives being deployed, the population size also factors into MTBF and a newer metric called Annual Failure Rate (AFR). AFR = 1 - EXP(-Annual Operating Hours / MTBF). You can find MTBF and AFR numbers on vendor websites and their data sheets.
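
As a rough illustration of the AFR formula above, here is a minimal Python sketch. The function names are my own for illustration, the 1.6 million hour MTBF and the 8,760 and 2,400 POH values come from the text, and the 600,000 hour desktop MTBF is an assumed placeholder rather than a number from any particular data sheet.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 hours of continuous operation


def annual_failure_rate(mtbf_hours, annual_operating_hours=HOURS_PER_YEAR):
    """AFR = 1 - exp(-annual operating hours / MTBF), returned as a fraction."""
    return 1.0 - math.exp(-annual_operating_hours / mtbf_hours)


def expected_failures_per_year(drive_count, mtbf_hours,
                               annual_operating_hours=HOURS_PER_YEAR):
    """Expected number of drive failures per year across a population."""
    return drive_count * annual_failure_rate(mtbf_hours, annual_operating_hours)


# Enterprise drive from the text: 1.6 million hour MTBF, continuous use.
enterprise_afr = annual_failure_rate(1_600_000, 8_760)

# Assumed desktop drive: 600,000 hour MTBF (hypothetical), used as designed.
desktop_afr = annual_failure_rate(600_000, 2_400)

# The same assumed desktop drive run continuously like an enterprise drive.
desktop_continuous_afr = annual_failure_rate(600_000, 8_760)

print(f"Enterprise AFR at 8,760 POH: {enterprise_afr:.2%}")
print(f"Desktop AFR at 2,400 POH:    {desktop_afr:.2%}")
print(f"Desktop AFR at 8,760 POH:    {desktop_continuous_afr:.2%}")
print(f"Expected failures per year in 1,000 enterprise drives: "
      f"{expected_failures_per_year(1_000, 1_600_000):.1f}")
```

With these inputs the enterprise drive works out to roughly 0.55 percent per year, while the assumed desktop drive goes from about 0.4 percent at 2,400 POH to about 1.45 percent when run around the clock. Keep in mind the formula only accounts for operating hours; it does not model the extra wear of running a drive outside of its design point.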

Another reliability and data integrity metric for HDDs is the number of non-recoverable read errors per bits read, which for higher end enterprise devices can be 1 in 10^16, and is typically one or two orders of magnitude worse (for example 1 in 10^14 or 10^15) for desktop and consumer devices.
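
To put the read error metric in perspective, the following Python sketch estimates the chance of hitting at least one non-recoverable read error while reading an entire drive once, such as during a rebuild. It assumes errors are independent and occur at exactly the specified rate, which is a simplification. The 4TB capacity and the 1 in 10^14 desktop figure are assumptions for illustration; the 1 in 10^16 figure is the enterprise value mentioned above.

```python
import math


def prob_unrecoverable_read_error(bytes_read, bits_per_error):
    """Probability of at least one non-recoverable read error.

    Treats each bit read as an independent trial with error probability
    1 / bits_per_error, so P = 1 - (1 - 1/bits_per_error) ** bits_read.
    log1p and expm1 keep the arithmetic accurate for very small rates.
    """
    bits_read = bytes_read * 8
    return -math.expm1(bits_read * math.log1p(-1.0 / bits_per_error))


CAPACITY_BYTES = 4 * 10**12  # assumed 4TB drive, read once end to end

p_desktop = prob_unrecoverable_read_error(CAPACITY_BYTES, 10**14)     # assumed rate
p_enterprise = prob_unrecoverable_read_error(CAPACITY_BYTES, 10**16)  # from the text

print(f"1 in 10^14: {p_desktop:.1%} chance of an error reading 4TB")
print(f"1 in 10^16: {p_enterprise:.2%} chance of an error reading 4TB")
```

Under these assumptions a full read of the 4TB drive has roughly a 27 percent chance of encountering an error at 1 in 10^14, vs. about 0.3 percent at 1 in 10^16, which is one reason the class of drive matters as capacities grow.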

While we are talking about reliability, a common conversation has to do with long drive rebuilds with higher capacity devices. If you are experiencing, or worried about, long drive rebuilds due to device failures, take a step back and revisit what type of drive you are using. Is the drive actually failing, or is it being proactively marked as going bad, or is it a false condition due to storage system, appliance, adapter or other software functionality? Thus, if concerned about drive failures and their subsequent rebuild times, look at using drives that are more reliable, along with storage systems and adapters that can better manage the devices and offer faster rebuild options.

Cache

HDDs include an amount of DRAM cache that is usually used for reads, however on some drives it can be configured as a volatile write cache. Note that since it is volatile and thus introduces a potential data integrity issue, the space is usually used as a read and read-ahead buffer. Capacities will vary based on make, model, OEM and price point, however they can include 128MB, 64MB, 32MB, 16MB and 8MB. In addition to DRAM cache, some HDDs also have added nand flash (SLC, MLC, eMLC) aka SSD integrated into the actual drive. Native HHDDs are different from fusion type or compound packaged solutions that include separate SSD and HDD with a controller or management software. An example of an integrated Hybrid Hard Disk Drive (HHDD) is the Seagate Momentus XT, which has 4GB or 8GB (depending on model) of SLC nand flash. I have written elsewhere about my experiences with HHDDs along with some performance comparisons. Since the HHDDs boot faster and have better read performance, I place some of my VMs on them and tell ESX that they are SSDs.

Cost

Let us be candid, cost is usually a driving factor when it comes to storage media from HDDs to SSDs, with the usual metric being how much capacity for a given price. For some applications or usage the cost per capacity can be valid, however where performance, availability or other functions are needed, different comparisons are required. For performance, what is the cost per IOP or bandwidth or a given level of response time or availability? As mentioned above, availability and reliability will vary with different types of drives, along with where you buy the drives. For example, notebook and desktop drives in a given form factor, capacity and interface may be lower cost than an enterprise device.
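
The difference between cost per capacity and cost per IOP can be sketched with a few lines of Python. The two drives, their prices, capacities and IOPS figures below are entirely hypothetical and only there to show how the ranking can flip depending on which metric you use.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    name: str
    price_usd: float    # hypothetical purchase price
    capacity_gb: float  # hypothetical usable capacity
    iops: float         # hypothetical sustained IOPS

    @property
    def cost_per_gb(self):
        return self.price_usd / self.capacity_gb

    @property
    def cost_per_iop(self):
        return self.price_usd / self.iops


# Made-up example drives, not real products or quotes.
drives = [
    Drive("high capacity desktop class", 100.0, 2000.0, 75.0),
    Drive("fast enterprise class", 240.0, 600.0, 200.0),
]

for d in drives:
    print(f"{d.name}: ${d.cost_per_gb:.2f} per GB, ${d.cost_per_iop:.2f} per IOP")

cheapest_capacity = min(drives, key=lambda d: d.cost_per_gb)
cheapest_performance = min(drives, key=lambda d: d.cost_per_iop)

print(f"Lowest cost per GB:  {cheapest_capacity.name}")
print(f"Lowest cost per IOP: {cheapest_performance.name}")
```

With these made-up numbers the desktop class drive wins on cost per GB while the enterprise class drive wins on cost per IOP, which is why the right metric depends on what the application actually needs.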

Enterprise devices are designed for continuous operation vs. start stop, and desktop devices the reverse, which are things to factor into the price. Also regarding costs, there are many different online venues from Amazon to Best Buy among others where you can buy various drives. Can you only buy desktop, notebook or consumer drives from venues such as Amazon? It turns out that either via Amazon or one of their partner affiliates you can buy enterprise drives along with others, however pay attention to the firmware and whether the drive is supported for your particular system or needs. Some storage system or server vendors support third party drives while others frown on doing so, even if the drive is from the same manufacturer, make and model.

There are also other venues where you might find the same drive with the same specs at a lower cost, perhaps even a deal too good to be true. Pay attention to whether those are new or refurbished drives and what the warranties are, and look closely at the specs to make sure they are what you need and that firmware is available. To say that you get what you pay for might be too easy, however it can be true. Likewise, there are some good bargains if you pay close attention to what it is that you are getting.

This wraps up this post. Up next in part III, we continue our look beyond the covers to determine the differences and what HDD is best for your virtual or other data storage needs.

Ok, nuff said (for now).

Cheers gs
