RAID now stands for Redundant Array of Independent Disks, though it originally stood for Redundant Array of Inexpensive Disks. Hard disk manufacturers probably wanted to get the notion of "inexpensive" out of the minds of consumers, hence the change to "Independent". Before getting the RAID levels explained, let us consider why RAID was implemented in the first place. The term was first coined and defined by David A. Patterson, Garth A. Gibson, and Randy Katz at the University of California, Berkeley, in 1987. The reason was simple: to combine multiple hard drives into a unit that provides higher performance.
However, the implementation of RAID was not limited to just that. Different RAID levels were designed to deal with different situations. First of all, when hard drives are accessed in parallel, data can be read and written at a faster rate. RAID also deals with hard drive data recovery: if one of the hard disks fails, the remaining disks keep working.
Hence, combining hard drives into a single logical unit in a computer server considerably increases availability and, in turn, the overall system's performance. This article will not only introduce you to the different RAID levels, but also help you compare them. Performance is another aspect that will be dealt with in this article.
Different RAID Levels Explained
RAID 0 and RAID 1 are the building blocks on which the other levels were further developed. Here, we will try to understand these two levels along with the others: RAID 2, RAID 3, RAID 4, RAID 5, RAID 6, RAID 10, and RAID 01. These are the most common and, of course, the standard ones.
In RAID 0, the data is striped across different hard drives. Hence, as expected, its performance at the server level is very high: a good number of terminals can access different bits of information at the same time. However, the basic reason it cannot be termed a proper RAID level is that no redundant information is stored, so a single disk failure can result in the loss of a good amount of data.
- Since redundant data is not stored, its capacity is excellent.
- It is very good for large data transfers.
- Splitting up of data across various hard drives provides very high input/output rates.
- There is no parity generation.
- Since copies of data are not created, it is very cost-effective; no extra space is used in storing duplicate data.
- It is very easy to implement.
- Data availability is very low: the array's MTBF is only a fraction of a single drive's, because any one drive failing takes the whole array down.
- It is not a proper RAID level, since it cannot provide data redundancy.
- A single disk failure can result in a considerable amount of data loss.
- It is not the right choice for critical systems, where data holds the prime importance.
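The striping idea behind RAID 0 can be sketched in a few lines of Python. This is a toy address mapping, not a real driver; the simple round-robin layout is an assumption made for illustration.

```python
# Toy RAID 0 block mapping: logical blocks are dealt out to disks round-robin,
# so consecutive blocks land on different spindles and can be read in parallel.

def raid0_map(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Return (disk index, block offset on that disk) for a logical block."""
    return logical_block % num_disks, logical_block // num_disks

# With 3 disks, logical blocks 0..5 spread across all three drives.
for lb in range(6):
    disk, offset = raid0_map(lb, 3)
    print(f"logical block {lb} -> disk {disk}, offset {offset}")
```

Because neighbouring logical blocks sit on different disks, a large sequential transfer keeps every drive busy at once, which is exactly where RAID 0's speed comes from.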
RAID 1 is known for its mirroring capability. Two hard disks are used, one of which stores duplicate data; in other words, the same data is stored on both hard disks. Thus, data redundancy is provided very well at this RAID level. However, the cost of implementing it is high, since one of the hard drives is used only for keeping a duplicate of the data on the other.
- The usable storage capacity is 50%, which is not bad.
- For large data transfers, it is also very good.
- In this level, reading of data is quite fast.
- Most importantly, failure of any one of the disks cannot cause data loss, as a backup is always there on the other hard disk.
- It is easy to implement.
- It is not very cost-effective, because one of the drives is just storing the duplicate data of the other.
- The writing speed is decreased, since data has to be written twice.
- The disk overhead is also very high.
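Mirroring is simple enough to demonstrate directly. The class below is a toy model (an assumption-laden sketch, not storage software): every write goes to both disks, any surviving disk can serve a read, and one disk failure loses nothing.

```python
# Toy RAID 1 mirror: duplicate writes, reads from any surviving copy.

class Raid1:
    def __init__(self, blocks: int):
        self.disks = [[None] * blocks, [None] * blocks]
        self.alive = [True, True]

    def write(self, block: int, data: bytes) -> None:
        for d, disk in enumerate(self.disks):
            if self.alive[d]:
                disk[block] = data          # write twice: the cost of mirroring

    def read(self, block: int) -> bytes:
        for d, disk in enumerate(self.disks):
            if self.alive[d]:
                return disk[block]          # any surviving copy will do
        raise IOError("both mirrors failed")

array = Raid1(4)
array.write(0, b"payload")
array.alive[0] = False                      # simulate one disk failing
assert array.read(0) == b"payload"          # the mirror still has the data
```

The doubled write in `write()` is precisely why RAID 1's write speed is lower, and the two `alive` flags show why only the loss of *both* disks loses data.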
In RAID 2, data is striped not at the block level, but at the level of bits. A Hamming code, which is a linear error-correcting code, is used for error correction; it is very efficient at recovering accurate data from single-bit corruption. This level also provides a very high data transfer rate.
- High data transfer rates.
- Single bit corruption of data can be accurately recovered.
- Double-bit corruption can also be detected with ease.
- Multiple-bit corruption is possible; it can be detected, but not corrected.
- The error-bit correction logic is very complex, which has made it an almost obsolete method of data storage.
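RAID 2 itself is obsolete, but the Hamming code behind it is easy to demonstrate. Below is a toy Hamming(7,4) encoder and corrector (a sketch of the textbook code, not of any particular RAID 2 controller): 3 parity bits protect 4 data bits, and the syndrome directly names the position of a single flipped bit.

```python
# Hamming(7,4): positions 1..7 hold p1, p2, d1, p3, d2, d3, d4.

def encode(d):                      # d = [d1, d2, d3, d4]
    p1 = d[0] ^ d[1] ^ d[3]         # covers positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]         # covers positions 2, 3, 6, 7
    p3 = d[1] ^ d[2] ^ d[3]         # covers positions 4, 5, 6, 7
    return [p1, p2, d[0], p3, d[1], d[2], d[3]]

def correct(c):
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]  # re-check parity group 1
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]  # re-check parity group 2
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]  # re-check parity group 3
    pos = s1 + 2 * s2 + 4 * s3      # syndrome = 1-indexed position of bad bit
    if pos:
        c[pos - 1] ^= 1             # flip it back
    return [c[2], c[4], c[5], c[6]] # recover d1..d4

word = encode([1, 0, 1, 1])
word[4] ^= 1                        # corrupt one bit "in transit"
assert correct(word) == [1, 0, 1, 1]
```

The syndrome trick, where the three parity checks spell out the failed position in binary, is what let a RAID 2 controller pinpoint and repair a single bad bit on the fly.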
In RAID 3, data is striped at the byte level. In this method, one additional hard disk is dedicated to holding the parity data. Since data is stored and striped at the byte level, accessing a single block of data requires access to more than one hard disk. Its use is very much limited to certain applications.
- For large file transfers, it provides very high read and write speeds.
- It is quite cost-effective.
- The capacity of this system is also very good, since only one extra hard disk is used for storing the parity data.
- It is not very good for small data transfers.
- Accessing a block of data means dealing with more than one hard drive in the array.
- Application is limited only to specific fields.
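The dedicated parity disk works on plain XOR, which is simple to show. The snippet below is an illustrative sketch with made-up byte values: the parity disk holds the XOR of the corresponding bytes on every data disk, so any one lost disk can be rebuilt from the survivors.

```python
# XOR parity, as used by RAID 3's dedicated parity disk (toy byte values).

data_disks = [b"\x12\x34", b"\xab\xcd", b"\x0f\xf0"]       # three data disks
parity = bytes(a ^ b ^ c for a, b, c in zip(*data_disks))  # the parity disk

# Disk 1 fails; XOR the survivors with the parity disk to rebuild it.
rebuilt = bytes(a ^ c ^ p
                for a, c, p in zip(data_disks[0], data_disks[2], parity))
assert rebuilt == data_disks[1]
```

This works because XOR is its own inverse: removing the surviving disks' bytes from the parity leaves exactly the missing disk's bytes.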
RAID 4 is quite similar to RAID 3. It also uses a dedicated parity disk, but the difference is that it stripes the data at the block level. This is another level that became obsolete very soon.
- It can provide multiple reads if the controller allows it to do so.
- It is also quite cost-effective.
- Unlike RAID 3, it does not require synchronized spindles.
- Since every write must update the single parity disk, writes are serialized and the system does not have very good performance.
- Writing data to disks is also very slow, as in addition to writing blocks of data, the parity data also needs to be written.
- Its implementation requires solving problems inherent in the design, which has made its use almost obsolete.
RAID 5 is perhaps the most popular RAID level. It also uses block-level striping, but instead of a single dedicated hard drive holding the parity data, parity is distributed across all the drives in the array. It also provides high storage capacity.
- High read/write speeds are possible. Unlike RAID 3 and 4, which it quickly replaced, it allows multiple concurrent writes.
- It is very cost-effective, and can be implemented with a minimum of just 3 hard drives.
- Its capacity is also very good.
- It is not very efficient with small writes, since each write requires the old parity block to be read, updated, and rewritten.
- Though the performance is very good in normal operation, a disk failure has a significant impact on the system's performance until the failed drive is rebuilt.
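Distributing the parity amounts to a placement rule. The sketch below assumes one common convention, a "left-symmetric" rotation where the parity block moves back one disk per stripe; real controllers may use other layouts.

```python
# RAID 5 rotating parity placement (one common convention, for illustration).

def parity_disk(stripe: int, num_disks: int) -> int:
    """Disk holding the parity block for a given stripe."""
    return (num_disks - 1 - stripe) % num_disks

# With 4 disks, parity rotates across stripes instead of pinning one disk.
layout = [parity_disk(s, 4) for s in range(4)]
print(layout)   # -> [3, 2, 1, 0]
```

Because no single disk holds all the parity, parity updates are spread over the whole array, which is how RAID 5 avoids the write bottleneck of RAID 4's dedicated parity disk.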
RAID 6 is an extension of RAID 5. Two parity blocks are written at this level to help in the data recovery process, preventing data loss even in the case of two concurrent disk failures.
- Its performance is high for read operations.
- Its capacity, though not as good as RAID 5 or other levels, is moderate and hence not very costly to implement.
- It is good for large data transfers.
- The performance is not very good for small data transfers.
- Writing data takes longer, as two parity blocks have to be computed and written.
RAID 10 is often referred to as RAID 1+0, because it combines the features of RAID 1 and RAID 0: data is striped across mirrored pairs, so a mirror of each block of data is kept. This is a very good system for handling multiple drive failures, as long as no mirror pair loses both of its disks.
- The usable disk capacity is moderate, at 50%, since here again a copy of each block of data is stored.
- It is very much fault tolerant, as it can handle multiple hard drive failures.
- It is very good for large data transfers.
- It is not very cost-effective: duplicate copies of data demand double the number of hard disks.
- Just as in the case of RAID 3, the drive spindles need to be synchronized.
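The "stripe of mirrors" failure rule is easy to state precisely: the array survives any set of failures that leaves at least one disk alive in every mirror pair. The check below is a toy model of that rule for an assumed 4-disk array with pairs {0,1} and {2,3}.

```python
# Toy RAID 10 survival check: no mirror pair may lose both of its disks.

def raid10_survives(failed: set[int], num_disks: int = 4) -> bool:
    pairs = [(d, d + 1) for d in range(0, num_disks, 2)]
    return all(not (a in failed and b in failed) for a, b in pairs)

assert raid10_survives({0})            # one disk down: fine
assert raid10_survives({0, 2})         # one disk from each pair: still fine
assert not raid10_survives({0, 1})     # both halves of one mirror: data lost
```

So RAID 10 can ride out several simultaneous failures; only the unlucky case of both disks in the same pair failing brings the array down.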
RAID level 01 can also be referred to as RAID 0+1. It likewise generates no parity. Many people confuse it with RAID 10. Here, however, the data is first striped across an array of hard drives, while a second array holds a mirror image of the first array's data.
- It requires a minimum of 4 drives and is very easy to implement.
- Since mirror images of all the blocks of data are created, the capacity is moderate at 50%.
- For large data transfers, the data transfer rate is quite high.
- Since duplicate copies of the same data are created, it is quite costly.
- The write operation takes comparatively longer time.
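The difference from RAID 10 shows up in which failure combinations are survivable. The toy check below assumes a 4-disk RAID 0+1 with stripe set A = disks {0,1} and stripe set B = disks {2,3}: because a single failure degrades its whole stripe set, the array needs at least one *complete* set left intact.

```python
# Toy RAID 0+1 survival check: at least one whole stripe set must be intact.

def raid01_survives(failed: set[int]) -> bool:
    set_a_ok = not (failed & {0, 1})   # stripe set A fully alive?
    set_b_ok = not (failed & {2, 3})   # stripe set B fully alive?
    return set_a_ok or set_b_ok

assert raid01_survives({0, 1})         # both failures in one set: other copy lives
assert not raid01_survives({0, 2})     # one failure in each set: array lost
```

Note the contrast with RAID 10: losing one disk from each half ({0, 2}) is harmless for a stripe of mirrors, but fatal here, which is why the two levels should not be treated as interchangeable.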