By Mike Cobb, Director of Engineering
All drives fail. Even in the cloud.
In a cloud-based data storage business, there is no actual magic cloud. Storage drives are housed in data centers where technicians maintain and protect copies of important customer information for a fee.
For the past four years, Backblaze, a U.S.-based cloud data storage company, has conducted comprehensive quarterly studies regarding reliability of the hard drives that store their customers’ data. These quarterly reports have become well known in technology circles, and techies worldwide eagerly study each edition the moment it’s released. Last week, Backblaze published its latest report, the Hard Drive Stats for Q3 2017.
To collect these stats, Backblaze takes a daily snapshot of the status of every drive in its data centers. The drives in the study were hard disk drives (HDDs) of various makes and models, sourced from multiple manufacturers and ranging from 3 to 12 terabytes in capacity. Together, these drives provided 400 petabytes of storage. No solid state drives (SSDs) were included in this study.
Summary of Results
Some of the drives in use at Backblaze are much older than others. While the average age is under three years, the oldest have been in use more than four. Across the board, results showed an average annualized failure rate of 2.07 percent. This is a slight increase over the 1.97 percent reported in the previous quarter.
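The article does not spell out how an annualized failure rate is computed. A common definition, and the one assumed in this minimal sketch, divides the number of failures by the number of drive-years observed. The function name and the input numbers below are hypothetical and are not taken from the report.

```python
def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """Return the annualized failure rate as a percentage.

    Assumed definition: failures per drive-year, times 100.
    One drive-year equals 365 drive-days.
    """
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365
    return failures / drive_years * 100


# Hypothetical inputs, NOT figures from the report:
# 1,000 drives observed for 90 days each, with 5 failures in that period.
example_afr = annualized_failure_rate(failures=5, drive_days=1_000 * 90)
print(f"{example_afr:.2f}%")  # 2.03%
```

Counting drive-days rather than drives keeps the rate comparable between a drive that ran all quarter and one that was installed a week ago.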
There was little difference in failure rates between “enterprise” drives (mostly used by companies) and “consumer” drives in the models that were included in the test. Specific causes of failure were not listed.
Since its previous quarterly report, Backblaze says it has added 9,599 new hard drives and retired 6,221. That is a net increase of 3,378 drives, bringing the total to 86,529.
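The drive counts can be checked with simple arithmetic. In this small sketch, the starting total is derived from the other figures and is not stated in the article; the variable names are illustrative only.

```python
# Figures quoted in the article.
added = 9_599
retired = 6_221
total_now = 86_529

# Net change over the period.
net_increase = added - retired

# Derived value (an inference, not a figure from the article):
# the drive count before these additions and retirements.
previous_total = total_now - net_increase

print(net_increase)    # 3378
print(previous_total)  # 83151
```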
Pods, Tomes and Vaults: Science Fiction or Smart System?
Backblaze entered the data storage business ten years ago in 2007. Today, the company offers “cloud-based” storage solutions for consumers, businesses and other storage solution providers.
To make sure no customer data is ever lost due to drive failure (after all, all drives fail), Backblaze has developed an extremely intelligent system designed to protect customer data not just from dying drives, but from server, power and many other types of failure. They claim that their storage concept is “simple.” But it’s not.
Here’s how the Backblaze system works.
The company starts with a unit of measurement it calls a storage pod. A storage pod is a grouping of sixty hard drives. A Backblaze Vault consists of twenty storage pods logically grouped together into a ginormous 1,200-drive RAID. Drives located in the same position across the twenty storage pods within a vault are called a tome.
A customer file is broken into seventeen pieces, known at Backblaze as shards. Those shards of data are spread out across seventeen of the twenty drives in a single tome. The remaining three drives in that tome store parity shards of the customer data. The twenty drives that make up the tome, each holding one piece of that file, are spread across the twenty storage pods in a vault, one drive per pod.
Here’s a handy illustration from Backblaze to help explain:
To simplify, what this means is that one customer file requires twenty drives for storage at Backblaze. With hundreds of thousands of customers relying on Backblaze to safely store backups of their irreplaceable data, it’s not surprising they need tens of thousands of drives.
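The numbers above can be tied together in a short sketch. It models only the layout (pods, tomes and shard counts), not the actual encoding. It also assumes that, as with a typical 17+3 erasure code, any seventeen of the twenty shards are enough to rebuild a file. The function and constant names are hypothetical.

```python
PODS_PER_VAULT = 20
DRIVES_PER_POD = 60
DATA_SHARDS = 17
PARITY_SHARDS = 3

drives_per_vault = PODS_PER_VAULT * DRIVES_PER_POD  # 1,200 drives
tomes_per_vault = DRIVES_PER_POD                    # one tome per drive position
shards_per_file = DATA_SHARDS + PARITY_SHARDS       # 20 shards, one per pod


def tome_drives(position: int) -> list[tuple[int, int]]:
    """Return the (pod, position) pairs that make up the tome at one drive position."""
    if not 0 <= position < DRIVES_PER_POD:
        raise ValueError("position must be between 0 and 59")
    return [(pod, position) for pod in range(PODS_PER_VAULT)]


def is_recoverable(failed_drives: int) -> bool:
    """Assumption: any 17 surviving shards out of 20 can rebuild the file."""
    return shards_per_file - failed_drives >= DATA_SHARDS


# Raw storage consumed per unit of customer data (20 shards stored for 17 of data).
storage_overhead = shards_per_file / DATA_SHARDS

print(drives_per_vault)                       # 1200
print(len(tome_drives(0)))                    # 20
print(is_recoverable(3), is_recoverable(4))   # True False
print(f"{storage_overhead:.3f}")              # 1.176
```

Under these assumptions, a file survives the loss of up to three drives, or three whole pods, in its tome, at a cost of roughly 18 percent extra raw storage.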
Benefits of Cloud Storage
At some point, all data storage devices are destined to fail as a result of age, mechanical breakdown or accident. This is true regardless of whether your data is stored on a hard disk drive (HDD), solid state drive (SSD), helium drive or flash drive. The only real question is when drive failure will happen.
DriveSavers recommends a backup strategy like three, two, one: Three copies of your data on two different types of media and at least one copy kept off-site. This means storing one copy of your data on your own computer’s hard drive as your working data, keeping a copy of that same data on another local storage device, like an external drive, and a third copy on another drive kept off-site or saved to a cloud storage solution like Backblaze.
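The three, two, one rule can be written as a simple checklist. The sketch below is illustrative only: the BackupCopy class, its fields and the example plan are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    label: str        # what this copy is, e.g. "working data"
    media_type: str   # e.g. "internal drive", "external drive", "cloud storage"
    offsite: bool     # True if the copy is kept away from the primary location


def satisfies_three_two_one(copies: list[BackupCopy]) -> bool:
    """Check for three copies, two media types and at least one off-site copy."""
    enough_copies = len(copies) >= 3
    enough_media = len({copy.media_type for copy in copies}) >= 2
    has_offsite = any(copy.offsite for copy in copies)
    return enough_copies and enough_media and has_offsite


# Example plan matching the description above.
plan = [
    BackupCopy("working data", "internal drive", offsite=False),
    BackupCopy("local backup", "external drive", offsite=False),
    BackupCopy("remote backup", "cloud storage", offsite=True),
]

print(satisfies_three_two_one(plan))      # True
print(satisfies_three_two_one(plan[:2]))  # False: only two copies, none off-site
```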
The benefits of storing data in the cloud start with universal access, provided you have a good Internet connection to the host. Physical separation also means your data won’t be affected if your home or business computer system suffers a failure or disaster, such as a fire, flood or physical theft.
For example, if you store your data on a cloud storage system, you’ll be able to get to that data from any location that has Internet access. And, if others need access to the same information, you can give a link to anyone you want to have it.
Whatever you do, never rely on any one storage option for keeping irreplaceable data. No matter how advanced a cloud company’s system may be, they are still storing your data on physical devices. Physical devices fail. Unforeseen problems occur.
Do your research into cloud storage and choose an option that works for you, but always keep another copy of your data in another location as well. That may be an external drive in your desk drawer or it may be a second cloud service provider. However you choose to store your backups, just remember three, two, one.