Data storage hardware isn’t perfect. Unintended losses and errors can make it impossible to access vital information and cripple mission-critical applications. Using a RAID, or redundant array of independent disks, may help data-dependent organizations maintain high availability and improve performance.
A Brief Introduction to RAID
RAID configurations create distributed data storage across multiple disks. The earliest implementations were devised at academic institutions in the late 1980s to demonstrate that standard personal computer disks could hold their own against high-end enterprise storage. Today, they minimize load on individual disks, reduce access latency and even provide fast recovery following hardware failures.
How Does RAID Work?
To a computer or network, a group of hard drives in a RAID configuration appears to be a single logical device, which storage systems typically address by a logical unit number, or LUN. RAID arrays are built with various hardware or software techniques, such as:
Parity is a form of error checking or data redundancy. In RAID, it typically consists of performing logical operations on two drives to generate parity data that’s stored on a third device. Suppose you add the contents of one drive to those of a second and store the results on a third. If drive one or two goes down, you can restore it by subtracting the contents of the remaining good drive from drive three. While this is just an analogy because parity uses logical XOR operations instead of actual binary addition, it illustrates how the concept can be extended to any number of grouped drives.
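The XOR idea described above can be sketched in a few lines of Python. This is only an illustration: the function name `xor_blocks` and the four-byte "drives" are made up for the example, and a real array works on much larger blocks.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Return the bytewise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Two tiny stand-ins for the contents of drive one and drive two.
drive_one = b"RAID"
drive_two = b"data"

# The parity block is what would be stored on the third drive.
parity = xor_blocks(drive_one, drive_two)

# Simulate losing drive one and rebuilding it from the two survivors.
rebuilt_one = xor_blocks(parity, drive_two)
```

Because XOR undoes itself, the same function that creates the parity block also rebuilds whichever drive is missing.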
Simply put, RAID mirroring replicates one set of data on a second, identical device. As you write information to the main drive, a copy is also written to the mirror, which enables recovery should the main drive fail. Although the duplicate write adds overhead, mirroring can deliver reads faster than a single disk could, because different parts of the requested data can be fetched from each device at the same time.
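A toy model may make the write and read paths easier to picture. In this sketch each "disk" is just a Python dictionary, and the class name `MirroredPair` and its alternating read policy are assumptions chosen for illustration, not how any particular controller behaves.

```python
class MirroredPair:
    """Toy two-way mirror: each dict stands in for one disk (block number -> data)."""

    def __init__(self):
        self.disks = [{}, {}]
        self.failed = [False, False]
        self._next = 0  # index of the disk to try first on the next read

    def write(self, block: int, data: bytes) -> None:
        # Every write goes to each healthy disk, which is why writes cost more.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                disk[block] = data

    def read(self, block: int) -> bytes:
        # Alternate between disks to share the read load; skip a failed disk.
        for offset in range(2):
            i = (self._next + offset) % 2
            if not self.failed[i]:
                self._next = (i + 1) % 2
                return self.disks[i][block]
        raise OSError("both disks have failed")


mirror = MirroredPair()
mirror.write(0, b"payroll")
mirror.failed[0] = True       # simulate losing the main drive
recovered = mirror.read(0)    # the request is still served from the mirror
```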
Disk striping takes a chunk of ordered data and splits it into sections that are then stored on different drives in the array. Because consecutive sections sit on different disks, files can be read faster by retrieving from each disk simultaneously. Striping offers no fault tolerance on its own because it stores no redundant data, but it can be combined with other methods.
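As a rough sketch of striping, the following Python deals fixed-size chunks out to the disks in round-robin order and then reads them back. The helper names `stripe` and `unstripe`, the two-byte chunk size, and the three-disk array are all hypothetical values picked to keep the example small.

```python
def stripe(data: bytes, num_disks: int, chunk_size: int) -> list:
    """Split data into chunks and deal them out to the disks round-robin."""
    disks = [[] for _ in range(num_disks)]
    for n, start in enumerate(range(0, len(data), chunk_size)):
        disks[n % num_disks].append(data[start:start + chunk_size])
    return disks


def unstripe(disks: list) -> bytes:
    """Reassemble the data by reading the chunks back in round-robin order."""
    out = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


disks = stripe(b"ABCDEFGHIJ", num_disks=3, chunk_size=2)
restored = unstripe(disks)
```

Here the first disk ends up holding the chunks `AB` and `GH`, the second `CD` and `IJ`, and the third `EF`. If any one of those lists were lost, the original data could not be reassembled, which is the weakness noted above.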
What Are RAID levels?
RAID schemes take numerous forms referred to as types or levels. While there are many to choose from, most disk arrays use one or more of the following standards:
RAID Level 0 (Stripe set)
This RAID level uses disk striping to spread information evenly over all the disks in an array. It needs two or more disks, and its main purpose is to provide heightened performance as economically as possible. Because it has no redundancy, however, the failure of any single disk results in the loss of the entire volume.
RAID Level 1 (Mirror)
RAID 1 is used for redundancy and read performance. Arrays consist of at least two disks, with one serving as a mirror of the other. In addition to making it easier to recover information, RAID 1 may reduce seek times by pulling the requested data from whichever disk is less busy or has its heads closer to the right position.
RAID Level 2 (Bit-level stripe with Hamming code)
RAID 2 is no longer commonly used. It employs striping at the bit level in conjunction with error-correcting codes called Hamming codes to preserve data across synchronized disks.
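To give a feel for how a Hamming code can locate and repair a bad bit, here is a minimal sketch of the classic Hamming(7,4) code in Python. The choice of the (7,4) variant, the bit layout, and the function names are assumptions made for illustration; the text above does not say which Hamming code a given array used. Each of the seven bits can be pictured as living on a separate disk.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into 7 bits laid out as p1 p2 d1 p4 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # checks positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (corrected copy, 1-based position of the flipped bit, or 0 if none)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4   # the syndrome spells out the bad position
    if position:
        c[position - 1] ^= 1
    return c, position


codeword = hamming74_encode([1, 0, 1, 1])
damaged = list(codeword)
damaged[4] ^= 1                       # flip the bit at position 5
repaired, bad_position = hamming74_correct(damaged)
```

The three check bits together point at the position of a single flipped bit, so the code can correct the error and not merely detect it. It handles only one bad bit per codeword.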
RAID Level 3 (Byte-level stripe with dedicated parity)
RAID 3, which is rarely used, stripes data byte by byte and includes a dedicated disk for storing parity data. As with RAID 2, disk rotation is synchronized. Because every request involves all of the disks, it cannot keep up with numerous small requests, so it's most commonly used in media servers and similar setups where data is accessed in long sequential chunks.
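RAID Level 4 (Block-level stripe with dedicated parity)
In RAID 4, striping occurs at the block level, and the array includes a disk dedicated to parity information. Its main advantage over RAID 2 and RAID 3 is that it handles small transfers well: a small read can be served by a single disk, and a small write involves only the target disk and the parity disk instead of every disk in the array.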
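The reason a small write does not need every disk is that XOR parity can be updated from the old block, the new block, and the old parity alone. The sketch below assumes a single stripe across three data disks plus one parity disk; the names `xor_blocks` and `small_write` and the byte values are hypothetical.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Return the bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(data, parity, disk_index, new_block):
    """Overwrite one data block and return the updated parity block.

    Only the target data disk and the parity disk are touched:
    new_parity = old_parity XOR old_block XOR new_block.
    """
    old_block = data[disk_index]
    new_parity = xor_blocks(parity, xor_blocks(old_block, new_block))
    data[disk_index] = new_block
    return new_parity


# One stripe: three data blocks plus the parity computed across all of them.
data = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
parity = reduce(xor_blocks, data)

# Update only the block on the second data disk.
parity = small_write(data, parity, 1, b"\xaa\x55")

# A full recomputation across every disk gives the same parity.
full_parity = reduce(xor_blocks, data)
```

Because every write still has to update the one parity disk, that disk can become a bottleneck under heavy write loads.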
RAID Level 5 (Stripe with parity)
In RAID 5, striping also occurs at the block level, but the parity data is distributed across all of the disks along with the data instead of being kept on a dedicated volume. The most prevalent RAID configuration, this scheme requires three or more disks, and it can recover from the failure of any single disk.
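One way to picture distributed parity is to print which disk holds the parity block for each stripe. The rotation rule below, where parity starts on the last disk and moves one disk to the left per stripe, is just one possible layout assumed for illustration. Real implementations offer several layouts, and the names `parity_disk` and `layout` are hypothetical.

```python
def parity_disk(stripe: int, num_disks: int) -> int:
    """Index of the disk holding parity for a stripe (assumed rotation rule)."""
    return (num_disks - 1 - stripe) % num_disks


def layout(num_stripes: int, num_disks: int) -> list:
    """Return one row per stripe, marking data blocks 'D' and parity 'P'."""
    rows = []
    for stripe in range(num_stripes):
        p = parity_disk(stripe, num_disks)
        rows.append(["P" if disk == p else "D" for disk in range(num_disks)])
    return rows


rows = layout(num_stripes=4, num_disks=4)
for row in rows:
    print(" ".join(row))
```

Over four stripes on four disks, each disk holds exactly one parity block, so no single disk absorbs all of the parity traffic.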
RAID Level 6 (Stripe with dual parity)
This configuration combines block-level striping with two independent sets of parity data distributed across the disks. As such, it can recover from the simultaneous failure of up to two drives, and it overcomes many of the problems associated with RAID 5 in larger disk arrays, such as the risk of a second drive failing during a lengthy rebuild.
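The trade-off between these levels is easiest to see as usable capacity. The following sketch assumes every disk in the array is the same size and that a RAID 1 array mirrors the same data on all of its disks. The function name `usable_capacity` is hypothetical, and the four-disk minimum for RAID 6 is a commonly cited figure that the text above does not state.

```python
def usable_capacity(level: int, num_disks: int, disk_size_tb: float) -> float:
    """Usable capacity in TB for RAID 0, 1, 5 or 6 with equal-sized disks."""
    minimum_disks = {0: 2, 1: 2, 5: 3, 6: 4}   # RAID 6 minimum is an assumption
    if level not in minimum_disks:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_disks < minimum_disks[level]:
        raise ValueError(f"RAID {level} needs at least {minimum_disks[level]} disks")
    if level == 0:
        return num_disks * disk_size_tb          # no redundancy at all
    if level == 1:
        return disk_size_tb                      # every disk holds the same data
    if level == 5:
        return (num_disks - 1) * disk_size_tb    # one disk's worth of parity
    return (num_disks - 2) * disk_size_tb        # RAID 6: two disks' worth of parity


for level in (0, 1, 5, 6):
    print(f"RAID {level}: {usable_capacity(level, 4, 2.0)} TB usable")
```

With four 2 TB disks, this gives 8 TB for RAID 0, 2 TB for RAID 1, 6 TB for RAID 5, and 4 TB for RAID 6.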