What is RAID and what are its protection levels?

RAID (Redundant Array of Independent Disks) is a storage virtualization technology that organises multiple drives into various arrangements to meet goals such as redundancy, speed and capacity. RAID can be categorized into software RAID and hardware RAID. In software RAID, the array is managed by the operating system. In hardware RAID, a dedicated controller with its own processor sits between the operating system and the disks and manages the array. There are various RAID levels, as discussed below.



...

RAID 0, Blocks Striped, No Mirror, No Parity

Minimum 2 disks | Excellent performance ( as blocks are striped ) | No redundancy ( no mirror, no parity ) | Don’t use this for any critical system.

RAID 0 is based on data striping. A stream of data is divided into multiple segments or blocks, and consecutive blocks are stored on different disks. When the system wants to read that data, it can read from all the disks simultaneously and join the blocks together to reconstruct the entire data stream. The benefit is that read and write throughput can scale with the number of disks, which makes RAID 0 a good fit where performance is a priority over other aspects. Also, the total capacity of the volume is the sum of the capacities of the individual disks. The downside, as you may have already guessed, is that there is no redundancy. If one of the disks fails, the data in the whole volume is lost, because every large file has blocks on the failed disk and there is nothing to recreate them from.
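The striping described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: disks are modelled as in-memory lists, blocks are dealt out round-robin, and the names stripe and read_back are made up for this example rather than taken from any RAID implementation.

```python
def stripe(data: bytes, num_disks: int, block_size: int) -> list[list[bytes]]:
    """Split data into fixed-size blocks and deal them round-robin across the disks."""
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), block_size):
        block_index = offset // block_size
        disks[block_index % num_disks].append(data[offset:offset + block_size])
    return disks


def read_back(disks: list[list[bytes]]) -> bytes:
    """Reassemble the stream by taking one block from each disk in turn."""
    blocks = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):  # the last row may be only partly filled
                blocks.append(disk[row])
    return b"".join(blocks)


data = b"ABCDEFGHIJKL"
disks = stripe(data, num_disks=3, block_size=2)
# Disk 0 holds AB and GH, disk 1 holds CD and IJ, disk 2 holds EF and KL.
assert read_back(disks) == data
```

Each disk holds only every third block, so if any one list is lost the remaining two cannot reproduce the stream. That is the "no redundancy" property in practice.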

Advantages:
Performance boost for read and write operations
Space is not wasted, as the entire capacity of the individual disks is used to store unique data

Disadvantages:
There is no redundancy/duplication of data. If one of the disks fails, all the data is lost.


...

RAID 1, Blocks Mirrored, No Stripe, No Parity

Minimum 2 disks | Good performance ( no striping. no parity ) | Excellent redundancy ( as blocks are mirrored ).

RAID 1 uses the concept of data mirroring. Data is mirrored, or cloned, to an identical disk (or set of disks) so that if one of the disks fails, the other one can be used. It also improves read performance, since different blocks of data can be read from different disks simultaneously, as the diagram illustrates: a multi-threaded process can access Block 1 from Disk 1 and Block 2 from Disk 2 at once, increasing the read speed much as RAID 0 does. Unlike RAID 0, however, writes get no such boost and can be slightly slower than on a single disk, since every drive must be updated whenever new data is written. Another disadvantage is that space is spent on duplicating the data, which increases the cost per unit of usable storage.
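As a rough sketch of the mirroring behaviour described above, the following Python class writes every block to all healthy disks and serves reads from whichever mirror is still alive. The class name MirroredVolume, the dictionary-based "disks" and the simple read-spreading rule are all assumptions made for illustration, not how any particular controller works.

```python
class MirroredVolume:
    """Toy RAID 1 volume: every disk holds a full copy of every block."""

    def __init__(self, num_disks: int = 2):
        self.disks = [dict() for _ in range(num_disks)]  # block number -> bytes
        self.failed = set()

    def write(self, block_no: int, data: bytes) -> None:
        # Every healthy mirror must be updated, so a write costs one
        # physical write per disk instead of being split between disks.
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                disk[block_no] = data

    def read(self, block_no: int) -> bytes:
        healthy = [i for i in range(len(self.disks)) if i not in self.failed]
        if not healthy:
            raise OSError("all mirrors have failed")
        # Spread reads over the healthy mirrors (a simple illustrative policy).
        chosen = healthy[block_no % len(healthy)]
        return self.disks[chosen][block_no]

    def fail(self, disk_no: int) -> None:
        """Simulate a disk failure: its contents are gone."""
        self.failed.add(disk_no)
        self.disks[disk_no].clear()


vol = MirroredVolume(num_disks=2)
vol.write(0, b"block one")
vol.write(1, b"block two")
vol.fail(0)                          # lose one disk
assert vol.read(0) == b"block one"   # data is still served by the mirror
assert vol.read(1) == b"block two"
```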


Advantages:
Data can be recovered in case of disk failure
Increased performance for read operations

Disadvantages:
Slow write performance
Space is spent on duplicating data, which increases the cost per unit of storage




...

RAID 5, Blocks Striped, Distributed Parity

Minimum 3 disks | Good performance ( as blocks are striped ) | Good redundancy ( distributed parity ) | Most cost-effective option providing both performance and redundancy. Use this for databases that are heavily read-oriented. Write operations will be slow, as parity must be updated on every write.

RAID 5 is very similar to RAID 4 (block-level striping with parity), but here the parity information is distributed across all the disks instead of being stored on one dedicated disk. This has two benefits. First, the dedicated parity disk is no longer a bottleneck, because the parity load is spread evenly over all the disks. Second, no single disk holds all the parity information, so whichever disk fails, its contents can be recomputed from the data and parity on the remaining disks.
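To make the parity idea concrete, here is a minimal Python sketch. It assumes the usual single-parity scheme, where the parity block is the byte-wise XOR of the data blocks in a stripe, and it rotates the parity position from stripe to stripe. The function names and the particular rotation order are illustrative choices, and the number of data blocks is assumed to fill whole stripes.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def build_stripes(data_blocks: list[bytes], num_disks: int) -> list[list[bytes]]:
    """Lay data out on num_disks disks, placing parity on a different disk per stripe."""
    per_stripe = num_disks - 1
    if len(data_blocks) % per_stripe != 0:
        raise ValueError("this sketch expects whole stripes only")
    stripes = []
    for start in range(0, len(data_blocks), per_stripe):
        chunk = data_blocks[start:start + per_stripe]
        parity_disk = (num_disks - 1 - len(stripes)) % num_disks  # rotate parity
        stripe = list(chunk)
        stripe.insert(parity_disk, xor_blocks(chunk))
        stripes.append(stripe)
    return stripes


def rebuild(stripes: list[list[bytes]], failed_disk: int) -> list[bytes]:
    """Recompute the failed disk's block in every stripe from the surviving disks."""
    return [
        xor_blocks([block for i, block in enumerate(stripe) if i != failed_disk])
        for stripe in stripes
    ]


blocks = [b"AAAA", b"BBBB", b"CCCC", b"DDDD", b"EEEE", b"FFFF"]
stripes = build_stripes(blocks, num_disks=3)
# Parity sits on disk 2 in the first stripe, disk 1 in the second, disk 0 in the third.
for failed in range(3):
    assert rebuild(stripes, failed) == [stripe[failed] for stripe in stripes]
```

The same rebuild function works whether the lost block held data or parity, because the XOR of all blocks in a stripe, parity included, is zero. It cannot recover two missing blocks in the same stripe, which is why RAID 5 survives only a single disk failure.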

Advantages:
All the advantages of RAID 4, plus faster writes and parity that no longer depends on a single dedicated disk

Disadvantages:
Can only handle a single disk failure


...

RAID 6, Blocks Striped, Two Distributed Parity

Minimum 4 disks | Block-level striping, just like RAID 5, but with dual parity | In the diagram, A, B, C are data blocks and p1, p2, p3 are parity blocks | Two parity blocks are created for each stripe of data blocks | Can handle two disk failures | This RAID configuration is complex to implement in a RAID controller, as it has to calculate two parity blocks for each stripe.

RAID 6 uses double parity blocks to achieve better data redundancy than RAID 5. This raises the fault tolerance to up to two drive failures in the array. Each stripe has two parity blocks, which are stored on different disks, and their position rotates across the array. RAID 6 is a practical choice for maintaining high-availability systems.
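The text above does not say how the two parity blocks are computed. One widely used construction, assumed here purely for illustration, keeps the RAID 5 XOR parity (often called P) and adds a second parity Q computed with multiplication in the finite field GF(2^8). The Python sketch below shows that scheme recovering two lost data blocks in a stripe. The function names are made up for this example, and the field polynomial 0x11d is an assumption of the sketch.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) using the polynomial 0x11d."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a: int, n: int) -> int:
    """Raise a to the n-th power in GF(2^8) by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def pq_parity(data_blocks: list[bytes]) -> tuple[bytes, bytes]:
    """P is the XOR of the blocks; Q weights block i by 2**i in GF(2^8)."""
    length = len(data_blocks[0])
    p, q = bytearray(length), bytearray(length)
    for i, block in enumerate(data_blocks):
        coeff = gf_pow(2, i)
        for j, byte in enumerate(block):
            p[j] ^= byte
            q[j] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def recover_two(data_blocks: list, x: int, y: int, p: bytes, q: bytes) -> tuple[bytes, bytes]:
    """Recover the data blocks at positions x and y (given as None) from P and Q."""
    length = len(p)
    survivors = [b if b is not None else bytes(length) for b in data_blocks]
    p_rest, q_rest = pq_parity(survivors)        # parity of the surviving blocks only
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    inv = gf_pow(gx ^ gy, 254)                   # v**254 is the inverse of v in GF(2^8)
    dx, dy = bytearray(length), bytearray(length)
    for j in range(length):
        a = p[j] ^ p_rest[j]                     # equals Dx ^ Dy
        b = q[j] ^ q_rest[j]                     # equals gx*Dx ^ gy*Dy
        dx[j] = gf_mul(b ^ gf_mul(gy, a), inv)
        dy[j] = a ^ dx[j]
    return bytes(dx), bytes(dy)


data = [b"RAID", b"six!", b"dual", b"pari"]
p, q = pq_parity(data)
damaged = [data[0], None, data[2], None]         # two disks lost
assert recover_two(damaged, 1, 3, p, q) == (data[1], data[3])
```

Two independent equations (P and Q) allow two unknown blocks to be solved for, which is where the two-disk fault tolerance comes from. The extra field arithmetic also hints at why the double parity is more costly to compute than plain XOR.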

Advantages:
Better data redundancy. Can handle up to 2 failed drives

Disadvantages:
Large parity overhead: two disks' worth of capacity goes to parity, and every write must update two parity blocks
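The capacity and redundancy trade-offs of the four levels above can be summarised with a small calculator. This is a sketch that assumes all disks are the same size and that RAID 1 mirrors a single disk's contents onto every other disk in the set. The function name raid_summary and the returned field names are invented for this example.

```python
def raid_summary(level: int, num_disks: int, disk_size_tb: float) -> dict:
    """Usable capacity and fault tolerance for RAID 0, 1, 5 and 6 (equal-size disks)."""
    # level -> (minimum disks, disks' worth of usable data, disk failures tolerated)
    rules = {
        0: (2, num_disks, 0),
        1: (2, 1, num_disks - 1),
        5: (3, num_disks - 1, 1),
        6: (4, num_disks - 2, 2),
    }
    if level not in rules:
        raise ValueError(f"unsupported RAID level: {level}")
    min_disks, data_disks, tolerated = rules[level]
    if num_disks < min_disks:
        raise ValueError(f"RAID {level} needs at least {min_disks} disks")
    return {
        "usable_tb": data_disks * disk_size_tb,
        "failures_tolerated": tolerated,
        "efficiency": data_disks / num_disks,  # fraction of raw capacity that is usable
    }


for level in (0, 1, 5, 6):
    print(level, raid_summary(level, num_disks=4, disk_size_tb=2.0))
```

With four 2 TB disks, this gives 8 TB usable for RAID 0, 2 TB for RAID 1, 6 TB for RAID 5 and 4 TB for RAID 6, which makes the parity overhead of RAID 6 on small arrays easy to see.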