RAID Best Practices

Andrew Rodriguez

Document Scope & Audience

Document Scope

This article covers what RAID is, how to integrate it, how to use it, and best practices to maximize your experience. No matter how big or small your organization, including home use, protecting data is important and there are two main ways this can be accomplished. 

  1. Backups

    1. Offloading data to another system.
    2. This addresses total system failure, viruses, corruption, and more.
  2. RAID

    1. Protects data from drive failure. 
    2. Can increase performance depending on configuration. 

Document Audience

This article is intended as a reference for all systems integrators and users who want a better understanding of RAID.

 

What Exactly is RAID?

RAID (Redundant Array of Inexpensive/Independent Disks) is a storage structure that unifies two or more disks so they can be used as one logical device. Spreading data across multiple disks can help the system survive the failure of any one disk, increase performance, or both.

The 3 Basic RAID configurations

  1. Striping (RAID 0)
    1. Data is written across multiple drives, which minimizes read, write, and access times since more than one disk can perform the request. In turn, this increases I/O performance.
  2. Mirroring (RAID 1)
    1. Replicates the same data on two or more drives, which addresses data redundancy and can help prevent data loss should any one drive fail. 
  3. Parity (RAID 5 & 6)
    1. This configuration provides fault tolerance by calculating parity from the data stored on the other drives. In the simplest case, the data on two drives is combined and the result is stored on a third disk. Should any one disk fail, there is enough information on the remaining drives to rebuild the missing data onto the replacement disk.
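
The parity idea can be illustrated with a short Python sketch. It assumes simple XOR parity over two tiny data blocks; the function name `xor_blocks` and the sample data are hypothetical, and real controllers work on much larger stripes and rotate parity across the drives.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Two data blocks that would live on two different drives.
data_a = b"RAID"
data_b = b"5678"

# The parity block would be stored on a third drive.
parity = xor_blocks([data_a, data_b])

# Simulate losing the drive that held data_b, then rebuild it
# from the surviving data block and the parity block.
rebuilt_b = xor_blocks([data_a, parity])
print(rebuilt_b == data_b)  # prints True
```

The same XOR call rebuilds whichever single block is missing, which is why one parity block can cover the loss of any one drive but not two.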

NOTE: It is possible to combine the features above into a single nested array, such as RAID 10, 50, or 60. The RAID controller handles combining the drives into these configurations to balance performance, capacity, redundancy, and cost for the application at hand.

 

Hardware RAID vs. Software RAID

RAID can be implemented with a dedicated hardware controller or in software. Each has its pros and cons, which are examined below.

Hardware RAID

This is accomplished via a dedicated RAID controller, most often a PCIe controller card or in some cases via a RAID-on-Chip (ROC). The RAID controller has its own processor and memory dedicated to the task, which prevents any extra storage load on the system CPU, allowing it to be used for all software requirements, operating system, and applications.

HW Pros:

  • Better performance when compared to software RAID.
  • Controller cards can be swapped for replacements or upgrades.
    • Exception: If a ROC was used, it would require the entire motherboard to be swapped out, which is much more invasive and time consuming.

HW Cons:

  • Having to add hardware increases the cost of a system.

Software RAID

Instead of having a dedicated controller, the storage workload is added to the system CPU and is driven within the OS. 

SW Pros

  • Lower cost since additional hardware is not needed.

SW Cons

  • Lower RAID performance as this workload is added to the CPU, which is already processing the operating system and applications.

How Does RAID Work?

The RAID system combines the individual drives into one logical disk. The OS treats the drive like any other drive present on the system. The OS does not detect the difference between a single disk, or a RAID being presented by the controller. There are some differences between HW and SW mechanics highlighted below.
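
To illustrate how several drives can appear as one logical disk, here is a minimal Python sketch of a striped address map. It assumes a round-robin layout with a stripe unit of one block; the function name `locate_block` is hypothetical, and real controllers use configurable stripe sizes.

```python
def locate_block(logical_block, num_drives):
    """Map a logical block number to (drive index, block offset on that drive)."""
    if num_drives < 2:
        raise ValueError("striping needs at least two drives")
    return logical_block % num_drives, logical_block // num_drives


# The OS addresses blocks 0..7 on one volume; the array spreads them over 3 drives.
for block in range(8):
    drive, offset = locate_block(block, num_drives=3)
    print(f"logical block {block} -> drive {drive}, offset {offset}")
```

The OS only ever uses the logical block number; the translation to a physical drive happens inside the controller or the software RAID driver.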

HW Mechanics

The RAID card is either directly attached to the disks destined for the RAID or connected to a hot-swap backplane, allowing the dedicated resources to manage the disks and present them as a logical volume to the OS. 

SW Mechanics

Software RAID is a driver or application on the host that loads with the OS and presents the selected drives attached to the system as the logical volume. The volume only becomes available once the system has booted far enough to load the driver software.

Given the server hardware being used today, it is always best to use a dedicated hardware RAID controller to take advantage of the increase in performance and the flexibility it provides. 

Hybrid RAID

The typical RAID is built with spinning disks but could consist entirely of SSDs for better performance. However, the latter comes with a significant price tag. One clever solution is a hybrid RAID that uses both spinning disks and SSDs to achieve better performance while keeping the cost lower than an all-SSD alternative. Essentially, write operations are performed on both the HDDs and the SSDs, but read operations are performed by the SSDs exclusively. This increases IOPS and reduces latency, allowing each server to host more users and perform more transactions, which in turn reduces the number of servers needed to support any given workload.

This can be used for simple mirrors in workstations all the way up to data center applications allowing for greater capacity in servers and faster booting of those systems. If this is of interest, further research in greater detail may be required.
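
The hybrid read/write split can be modelled with a toy Python class. This is only a sketch: the class name `HybridMirror` is hypothetical, the devices are plain dictionaries, and the fallback to the HDD when the SSD copy is missing is an assumption rather than a description of any particular controller.

```python
class HybridMirror:
    """Toy model of a hybrid mirror: one HDD copy and one SSD copy."""

    def __init__(self):
        self.hdd = {}
        self.ssd = {}
        self.reads_from = []  # records which device served each read

    def write(self, block, data):
        # Writes go to both devices so that either one holds a full copy.
        self.hdd[block] = data
        self.ssd[block] = data

    def read(self, block):
        # Reads are served by the SSD; the HDD is only a fallback (assumed).
        if block in self.ssd:
            self.reads_from.append("ssd")
            return self.ssd[block]
        self.reads_from.append("hdd")
        return self.hdd[block]


array = HybridMirror()
array.write(0, b"hello")
print(array.read(0), array.reads_from)
```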

Who Should Use RAID?

If the system in question requires constant uptime, RAID would make sense. With the technology being so readily available nowadays, it is suggested for any system to contain a RAID for the crucial data mounts. At the very minimum, any critical data and the OS should be on a RAID to allow the users and system admins to sleep easier at night. 

Even if a system not using RAID is being regularly backed up, there is still risk of some lost data when a drive fails. More importantly, the time to fix the issue will always be greater than if the disk failed when in a RAID.

When RAID is in place, a failed disk can most often be hot swapped for a replacement, and the controller will then rebuild the missing information onto the new disk. This has little to no impact on the user and takes a much smaller investment of time than restoring from backups.

Ideally, implementing a RAID and then making regular backups of that RAID to an off-system source will cover any scenario the system will eventually encounter. 

Choosing the Correct RAID Level

As mentioned above, there are several different RAID configurations that can be set up on a system (RAID 0, RAID 1, RAID 5, and RAID 6), each with its own pros and cons. Furthermore, nested configurations can combine multiple simpler RAIDs into one working volume (RAID 10, RAID 50, and RAID 60). The levels differ significantly from one another, so it is important to consider when and where to use each.

The factors to consider when picking your RAID include:

  • Capacity
  • Performance
  • Redundancy
  • Price

Unfortunately, there is no one-size-fits-all configuration, as improving any one attribute often comes at the cost of another in the list above. For example, a RAID that focuses on performance often does so at the cost of redundancy. A large, fast, highly redundant array will be EXPENSIVE. On the other hand, a small, average-speed array will cost significantly less, but it will not be as fast as the previous example.

Below are the different RAID configurations in more detail.

RAID 0 (Striping)

RAID 0 is the simplest RAID configuration to understand: all drives are combined into one large logical volume that is presented to the system. This is great for performance, as 2 or more drives share the work of the volume, allowing more heads and spindles to read or write the workload at hand. The tradeoff is that there is absolutely no redundancy. If a single disk in this configuration is lost, the entire volume is lost, including the data on the disks that did not fail. Given the high risk of total data loss, and the fact that SSDs are now quite affordable, RAID 0 is NOT RECOMMENDED. The threat of losing all data on the volume outweighs any performance gains it might provide.

Pros:

  • Fast and inexpensive.
  • All drive capacity is used.
  • Quick to set up.
  • All drives sharing the data load makes it the fastest of all arrays.

Cons:

  • NO DATA PROTECTION AT ALL.
  • If one drive fails, all data is lost with NO POSSIBILITY of recovery.

RAID 1 (Mirror)

This configuration duplicates the data across multiple drives. Whether there are 2 or 100 disks, every disk contains the same data while the set is still presented as one volume. This configuration is all about protection, not performance or capacity.

Pros:

  • HIGHLY REDUNDANT: each drive is an exact copy of the others in this RAID.
  • If a drive fails, there is no loss in system performance at all unless it is the last drive in the configuration.

Cons:

  • Performance is not much better than a single drive.

RAID 1E (Striped Mirror)

This configuration combines striping and mirroring in one array.

Pros:

  • Redundancy with better performance. Can be thought of as a mirror with an odd number of drives.

Cons:

  • High cost, since only 50% of the available drive space is usable.

RAID 5 (Striped w/ Parity)

Typically referred to as the best "all-around" RAID configuration, RAID 5 stripes data blocks and parity across all the available drives. A minimum of 3 drives is needed, and an array can go all the way up to 32. Should a drive fail, the information needed to reconstruct it is present on the remaining drives, allowing the replacement disk to be filled with the data the original drive contained. Performance is rather close to RAID 0, but writes require more operations because parity must be written along with the data.

Pros:

  • Good value and "all-around" performance.
  • Usable capacity is the total drive capacity minus one disk.

Cons:

  • Only one drive can fail at any one time without data loss.
  • Should 2 drives fail, all data is lost.

RAID 6 (Striped w/ Dual Parity)

Like RAID 5 in design and performance; however, parity is written in two places. This allows 2 drives to fail before a total loss of data. This extra security comes at the price of another disk's worth of capacity being lost to parity. The minimum number of drives is 4 and the maximum is 32.

Pros:

  • Pretty good value for the required investment.
  • Can withstand 2 drives failing.

Cons:

  • More expensive than RAID 5 due to another drive being used for parity.
  • Slightly slower than RAID 5 in most applications.

RAID 10 (Striping w/ Mirroring)

Also called RAID 1+0, this consists of multiple mirrored pairs striped together into one logical volume. This option offers good performance and data protection while eliminating parity calculations. It can be set up on any even number of drives and can be expanded by adding drives in mirrored pairs, referred to as legs. For example, it can be set up on 10 drives consisting of 5 pairs, offering 5 drives' worth of storage.

Pros:

  • Fast and redundant.

Cons:

  • Expensive, since it requires at least 4 drives but only offers the storage capacity of 2.
  • Not the best for large capacities because of the related cost.
  • Not as fast as RAID 5 in most streaming environments.

RAID 50 (Striping w/ Parity)

Also referred to as RAID 5+0, this combines multiple RAID 5 sets into a RAID 0. This allows larger volumes to be created, and each RAID 5 subset can withstand one drive failure before data loss occurs. This configuration also has faster rebuild times than a traditional RAID 5 array. Although it can be created on as few as 6 drives, it really should be used on a minimum of 16 drives. Usable space depends on the number of drives in the array and is anywhere between 67% and 94%. For example, a 24-drive array could be 2 legs of 12 drives, with 2 drives' worth of capacity in total used for parity, OR three legs of 8 drives, with 3 drives' worth in total used for parity. The latter option offers less overall storage but would be faster, as a rebuild would only affect the drives on that leg.

Pros:

  • Reasonable value for the cost.
  • Very good all-around performance, especially for streaming and large storage capacities.

RAID 60 (Striping w/ Dual Parity)

Also referred to as RAID 6+0, this combines multiple RAID 6 sets into a RAID 0. It is very similar to RAID 50 but offers 2 disks of parity per leg, allowing 2 drives per leg to fail before data loss occurs. A minimum of 8 drives is needed, but 16 is recommended. Usable space is between 50% and 88%, depending on how the legs are configured. For example, with 36 drives, a RAID with 2 legs of 18 drives each or a RAID with 3 legs of 12 drives each could be used.

Pros:

  • Can sustain 2 drive failures per leg.
  • Very large and relatively good value, since it is only used for large configurations.

Cons:

  • Needs LOTS of drives.
  • More expensive than RAID 50 due to the extra level of parity.
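
The capacity figures for these levels can be checked with a small Python sketch. It assumes identical drives and counts capacity in whole drives. The function names `usable_drives` and `usable_percent` are hypothetical, RAID 1E is left out, and the 32-drive cases assume 2 legs of 16 drives to reproduce the upper ends of the quoted ranges.

```python
def usable_drives(level, drives, legs=1):
    """Return how many drives' worth of capacity is usable for a RAID level."""
    if level == "0":
        return drives
    if level == "1":
        return 1                   # every drive holds the same copy
    if level == "5":
        return drives - 1
    if level == "6":
        return drives - 2
    if level == "10":
        return drives // 2
    if level == "50":
        return drives - legs       # one drive of parity per RAID 5 leg
    if level == "60":
        return drives - 2 * legs   # two drives of parity per RAID 6 leg
    raise ValueError(f"unsupported RAID level: {level}")


def usable_percent(level, drives, legs=1):
    """Usable capacity as a rounded percentage of raw capacity."""
    return round(100 * usable_drives(level, drives, legs) / drives)


print(usable_drives("50", 24, legs=2))   # 22 usable drives, 2 for parity
print(usable_drives("50", 24, legs=3))   # 21 usable drives, 3 for parity
print(usable_percent("50", 6, legs=2))   # 67
print(usable_percent("50", 32, legs=2))  # 94
print(usable_percent("60", 8, legs=2))   # 50
print(usable_percent("60", 32, legs=2))  # 88
```

Adding legs costs one (RAID 50) or two (RAID 60) drives of capacity per leg, which is the trade made for smaller, faster rebuild groups.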

 

When to Use Which RAID Level

Data workloads can be roughly categorized into two classes: Streaming and Random. RAID levels can likewise be placed into two groups: Non-Parity (RAID 1 and 10) and Parity (RAID 5, 6, 50, and 60).

Random data loads tend to be small in nature, while Streaming loads are often larger. Although all systems will experience loads of both kinds, it is best to design for the average workload a system will experience. If the system will handle a lot of Random loads, Non-Parity is the better option. For systems that will be used for Streaming, Parity configurations are the better choice.

Drive Size Performance

Even though HDDs are getting larger, their speed has not really changed for some time. For example, two disks of different capacities (1TB and 6TB) offer vastly different storage space, but they spin at the same speed. A larger-capacity disk therefore takes longer to find the information needed, and it also takes longer to rebuild, since the entire disk needs to be written even if the space is empty. The general rule of thumb is to build a RAID from more, smaller drives rather than reaching the same amount of storage with fewer, larger disks. For example, a RAID with three 6TB drives will be outperformed by a RAID that has six 3TB drives, since the workload is spread out across more heads and platters.
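
A rough way to see the effect of spreading the same raw capacity over more spindles is the idealized Python sketch below. The per-drive rate of 150 MB/s is an assumed figure, not a measurement, and the hypothetical `aggregate_throughput` function ignores parity overhead, controller limits, and seek behavior.

```python
def aggregate_throughput(num_drives, per_drive_mb_s):
    """Idealized streaming throughput if every drive contributes equally."""
    return num_drives * per_drive_mb_s


PER_DRIVE_MB_S = 150  # assumed, and identical for both drive sizes

few_large = aggregate_throughput(3, PER_DRIVE_MB_S)   # three 6TB drives, 18TB raw
many_small = aggregate_throughput(6, PER_DRIVE_MB_S)  # six 3TB drives, 18TB raw
print(few_large, many_small)  # 450 900
```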

This is not necessarily true for SSDs, as larger-capacity devices are typically faster than smaller ones. SSDs therefore need to be reviewed with extra scrutiny to ensure the specs of the purchased equipment meet your needs. Because SSDs are completely different from their spinning counterparts, the general rule of thumb for SSD RAID is to achieve the storage level you need with as few devices as possible. The larger devices have greater throughput than the smaller options, and the difference will be seen in system performance.

Size of Array vs. Size of Drives

Something that is often overlooked when creating RAIDs is that the entire disk space is not needed to create the RAID. Often when configuring the RAID, a portion of the drive can be used, leaving the remaining space to be used in another array if needed. For example, a RAID 10 can be spread across all the drives using a small portion of each that will ultimately host the OS. This then leaves most of the disk space left untouched that can then be used in a RAID 5 for data or other uses.
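
The split described above can be worked through with a short Python sketch. The drive count, drive size, and OS slice size are made-up example values, and the function name `sliced_layout` is hypothetical.

```python
def sliced_layout(num_drives, drive_gb, os_slice_gb):
    """Usable GB of a RAID 10 OS slice and a RAID 5 data slice on the same drives."""
    if num_drives < 4 or num_drives % 2:
        raise ValueError("RAID 10 needs an even number of drives, at least 4")
    if os_slice_gb >= drive_gb:
        raise ValueError("the OS slice must be smaller than the drive")
    # RAID 10 keeps half of the sliced space; the other half holds the mirrors.
    raid10_gb = (num_drives // 2) * os_slice_gb
    # RAID 5 loses one drive's share of the remaining space to parity.
    raid5_gb = (num_drives - 1) * (drive_gb - os_slice_gb)
    return raid10_gb, raid5_gb


# Four 4000GB drives with a 100GB slice of each reserved for the OS array.
print(sliced_layout(4, 4000, 100))  # (200, 11700)
```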

Rebuild Times and Large Arrays

As mentioned above, rebuild times can be affected by multiple factors. Unfortunately, the more drives in an array and the bigger the disks, the longer the rebuild will take whenever a disk needs to be replaced. Even though a RAID 5 can have up to 32 disks in the array, this becomes rather impractical with spinning media due to the increase in rebuild times. By contrast, a RAID 50 would be a better fit, since it could be composed of two legs of 16 drives each. When a disk needs to be rebuilt, only the 15 good disks in that leg and the replacement disk are affected. The other 16 drives are left untouched, allowing them to perform as if nothing were going on with the server while also allowing for better rebuild times.

NOTE: If drives are 6TB+, rebuild times will be greater than 24 hours and that is assuming there is no additional load on the server. If the server is still in use during the rebuild, the time will increase even further. 
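
As a back-of-the-envelope check, rebuild time can be estimated as drive capacity divided by the effective rebuild rate. The Python sketch below assumes a constant effective rate of 60 MB/s, which is an illustrative figure rather than a measured one; actual rates depend on the controller, the drives, and the load on the server. The function name `rebuild_hours` is hypothetical.

```python
def rebuild_hours(drive_tb, rebuild_mb_s):
    """Hours to rewrite a whole drive at a constant effective rebuild rate."""
    drive_bytes = drive_tb * 10**12                  # decimal terabytes
    seconds = drive_bytes / (rebuild_mb_s * 10**6)   # decimal megabytes per second
    return seconds / 3600


print(round(rebuild_hours(6, 60), 1))  # 27.8
```

Because the whole drive is rewritten regardless of how full it is, the estimate scales linearly with drive capacity.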

In regard to SSDs, the rebuild times will be much faster since they are often much smaller than their spinning counterparts and their throughput is much greater. 

Summary

In the current landscape of computers and servers, RAID is invaluable and should be implemented anywhere that data is important. It will save you no matter how good your hardware is or how careful you are. However, it should be implemented correctly for your needs and used together with data backups to minimize any potential issue that may surface. Depending on your needs, system use, and budget, several decisions need to be made. If your budget allows for a hardware controller, it is always the better choice. The next decision is based on use, but also on how much space is needed and what kind of data is going to be hosted. Again, if the budget allows, SSDs are superior, but if spinning disks are to be used, it is always a better idea to use smaller-capacity HDDs to achieve the determined storage goal. This puts more heads to work and leaves less space to search on any given drive, allowing for better distribution of the workload, faster search times, and faster rebuild times.

The key takeaway is that you should use both RAID and backups to avoid losing data. A failure WILL happen, and by using both, the chance of data loss and downtime is dramatically reduced, if not eliminated. Furthermore, the CORRECT RAID level for your use case needs to be configured and actively managed. Unfortunately, RAID and backups will not save you from user error, so it is always better to measure twice and cut once.

 
