What is RAID and its protection levels?
RAID 0: Minimum 2 disks | Excellent performance (blocks are striped) | No redundancy (no mirror, no parity) | Don’t use this for any critical system.
RAID 0 is based on data striping. A stream of data is divided into multiple segments, or blocks, and consecutive blocks are stored on different disks. When the system reads that data, it can fetch blocks from all the disks simultaneously and join them to reconstruct the entire stream. Reads and writes are therefore much faster than on a single disk, because the work is spread across every disk in the array. This makes RAID 0 a good fit where performance is a priority over other aspects. The total capacity of the volume is also the sum of the capacities of the individual disks. The downside, as you may have already guessed, is that there is no redundancy at all. If one of the disks fails, the data on the whole volume is lost, since the missing blocks cannot be recreated.
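The striping described above can be sketched in a few lines of Python. This is only a toy model, not how a real controller works: the disks are in-memory lists, and the names `stripe` and `read_back` and the 4-byte `block_size` are assumptions made up for the illustration.

```python
def stripe(data: bytes, num_disks: int, block_size: int = 4) -> list:
    """Split data into blocks and deal them round-robin across the disks."""
    disks = [[] for _ in range(num_disks)]
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    for index, block in enumerate(blocks):
        disks[index % num_disks].append(block)
    return disks


def read_back(disks: list) -> bytes:
    """Rebuild the stream by taking one block from each disk in turn."""
    out = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


data = b"ABCDEFGHIJKLMNOPQRSTUVWX"   # 24 bytes -> six 4-byte blocks
disks = stripe(data, num_disks=2)
assert read_back(disks) == data

# With one disk gone, only every other block survives,
# so the original stream cannot be rebuilt.
surviving = [disks[0], []]
assert read_back(surviving) != data
```

Each disk ends up holding half of the blocks, which is why the full capacity is usable and also why a single failure is fatal.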
Advantages:
Performance boost for read and write operations
No space is wasted, as the full capacity of every disk is used to store unique data
Disadvantages:
There is no redundancy or duplication of data. If one of the disks fails, all the data is lost.
RAID 1: Minimum 2 disks | Good performance (no striping, no parity) | Excellent redundancy (blocks are mirrored).
RAID 1 uses the concept of data mirroring. Data is mirrored, or cloned, to an identical set of disks so that if one of the disks fails, the other one can be used. It also improves read performance, since different blocks of data can be read from different disks at the same time. This can be explained in the diagram below. A multi-threaded process can read Block 1 from Disk 1 and Block 2 from Disk 2 at once, increasing the read speed much as RAID 0 does. Unlike RAID 0, however, write performance is reduced, since every drive must be updated whenever new data is written. Another disadvantage is that space is used up duplicating the data, which increases the cost per unit of storage.
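A minimal sketch of the mirroring behaviour, assuming a toy in-memory model: the `Mirror` class and its `write`, `read` and `fail` methods are hypothetical names invented for this example, and alternating reads between disks is just one simple way to picture the read speed-up.

```python
class Mirror:
    """Toy RAID 1 volume: every block is written to every disk."""

    def __init__(self, num_disks: int = 2):
        self.disks = [dict() for _ in range(num_disks)]  # block number -> data
        self.failed = set()
        self.next_reader = 0

    def write(self, block_no: int, data: bytes) -> None:
        # A write must update every working disk, which is why writes are slower.
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                disk[block_no] = data

    def read(self, block_no: int) -> bytes:
        # Reads alternate between the working disks, spreading the load.
        working = [i for i in range(len(self.disks)) if i not in self.failed]
        if not working:
            raise IOError("all mirrors have failed")
        chosen = working[self.next_reader % len(working)]
        self.next_reader += 1
        return self.disks[chosen][block_no]

    def fail(self, disk_no: int) -> None:
        # Simulate a disk failure: its contents are gone.
        self.failed.add(disk_no)
        self.disks[disk_no].clear()


volume = Mirror(num_disks=2)
volume.write(1, b"block one")
volume.write(2, b"block two")
volume.fail(0)  # lose the first disk
assert volume.read(1) == b"block one"
assert volume.read(2) == b"block two"
```

After the simulated failure, both blocks are still readable from the surviving copy, at the cost of having stored everything twice.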
Advantages:
Data can be recovered in case of disk failure
Increased performance for read operations
Disadvantages:
Slow write performance
Space is wasted by duplicating data, which increases the cost per unit of storage
RAID 5: Minimum 3 disks | Good performance (blocks are striped) | Good redundancy (distributed parity) | The most cost-effective option providing both performance and redundancy. Use this for databases that are heavily read-oriented. Write operations will be slow.
RAID 5 is very similar to RAID 4, but here the parity information is distributed over all the disks instead of being stored on a dedicated disk. This has two benefits. First, the dedicated parity disk is no longer a bottleneck, as the parity load evens out when every disk stores part of the parity information. Second, redundancy does not depend on one dedicated parity disk, since no single disk stores all the parity information.
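To make distributed parity concrete, here is a small sketch that assumes the parity block is a plain XOR of the data blocks in its stripe. The function names and the particular rotation order of the parity block are illustrative assumptions, not a specific controller's layout, and the sketch assumes equal-length blocks whose count is a multiple of `num_disks - 1`.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def build_stripes(data_blocks, num_disks):
    """Lay out data in stripes, moving the parity block to a different disk each time."""
    per_stripe = num_disks - 1
    stripes = []
    for start in range(0, len(data_blocks), per_stripe):
        chunk = list(data_blocks[start:start + per_stripe])
        parity_disk = (num_disks - 1 - len(stripes)) % num_disks
        row = chunk[:]
        row.insert(parity_disk, xor_blocks(chunk))  # parity takes its slot in the row
        stripes.append(row)
    return stripes


def rebuild(stripe, lost_disk):
    """Recover the block on a failed disk by XOR-ing the surviving blocks."""
    survivors = [blk for i, blk in enumerate(stripe) if i != lost_disk]
    return xor_blocks(survivors)


blocks = [b"AAAA", b"BBBB", b"CCCC", b"DDDD", b"EEEE", b"FFFF"]
stripes = build_stripes(blocks, num_disks=3)

# Whichever single disk is lost, every stripe can be rebuilt from the other two.
for lost in range(3):
    for stripe in stripes:
        assert rebuild(stripe, lost) == stripe[lost]
```

With three disks the parity lands on disk 2, then disk 1, then disk 0, so no single disk absorbs every parity update. A second simultaneous failure would leave too few survivors to XOR, which is the single-failure limit listed below.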
Advantages:
All the advantages of RAID 4, plus faster writes than RAID 4 and better data redundancy
Disadvantages:
Can only handle up to a single disk failure
RAID 6: Minimum 4 disks | Block-level striping, just like RAID 5, but with dual parity | In the above diagram A, B, C are blocks and p1, p2, p3 are parities | Creates two parity blocks for each stripe of data blocks | Can handle two disk failures | Complex to implement in a RAID controller, as it has to calculate two parity values for each stripe.
RAID 6 uses double parity blocks to achieve better data redundancy than RAID 5. This raises the fault tolerance to up to two drive failures in the array. Each stripe has two parity blocks, which are stored on different disks across the array. RAID 6 is a very practical choice for maintaining high-availability systems.
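The text does not say how the two parity blocks are computed. One widely used approach, assumed here purely for illustration, keeps a plain XOR parity (P) and adds a second, weighted parity (Q) calculated in the finite field GF(2^8). The sketch below works on one byte per data disk; the function names, the generator 2 and the reduction polynomial 0x11D are assumptions of the example, not something the source specifies.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reducing by the polynomial 0x11D."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a: int, n: int) -> int:
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a: int) -> int:
    """Inverse of a non-zero element: a^254, because a^255 = 1 in GF(2^8)."""
    return gf_pow(a, 254)


def parities(data):
    """Return (P, Q): P is the XOR of the bytes, Q weights byte i by 2^i."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d
        q ^= gf_mul(gf_pow(2, i), d)
    return p, q


def recover_two(data, x, y, p, q):
    """Recover the bytes of failed data disks x and y from the survivors, P and Q."""
    for i, d in enumerate(data):
        if i not in (x, y):
            p ^= d
            q ^= gf_mul(gf_pow(2, i), d)
    # Now p = dx ^ dy and q = 2^x*dx ^ 2^y*dy: two equations, two unknowns.
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    dx = gf_mul(q ^ gf_mul(gy, p), gf_inv(gx ^ gy))
    return dx, p ^ dx


data = [0x41, 0x42, 0x43, 0x44]      # one byte from each of four data disks
p, q = parities(data)
damaged = [None, 0x42, None, 0x44]   # disks 0 and 2 have failed
assert recover_two(damaged, 0, 2, p, q) == (0x41, 0x43)
```

Two independent parity equations are what allow two unknown blocks to be solved for, and the extra field arithmetic is a fair picture of why dual parity is harder on a controller than RAID 5's single XOR.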
Advantages:
Better data redundancy. Can handle up to 2 failed drives
Disadvantages:
Large parity overhead: two disks' worth of capacity is used for parity
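The capacity trade-offs of the four levels can be compared with a short calculation. This is a rough sketch assuming identical disks and, for RAID 1, that every disk in the set holds the same copy; `usable_capacity` is a hypothetical helper name, and the minimum disk counts are the ones given in the summaries above.

```python
def usable_capacity(level: int, num_disks: int, disk_size_tb: float) -> float:
    """Usable capacity in TB for an array of identical disks."""
    minimum = {0: 2, 1: 2, 5: 3, 6: 4}
    if level not in minimum:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_disks < minimum[level]:
        raise ValueError(f"RAID {level} needs at least {minimum[level]} disks")
    if level == 0:
        data_disks = num_disks       # every disk holds unique data
    elif level == 1:
        data_disks = 1               # every disk holds the same copy
    elif level == 5:
        data_disks = num_disks - 1   # one disk's worth of parity
    else:
        data_disks = num_disks - 2   # two disks' worth of parity
    return data_disks * disk_size_tb


# Four 2 TB disks under each level:
assert usable_capacity(0, 4, 2.0) == 8.0
assert usable_capacity(1, 4, 2.0) == 2.0
assert usable_capacity(5, 4, 2.0) == 6.0
assert usable_capacity(6, 4, 2.0) == 4.0
```

With four 2 TB disks, RAID 6 keeps half the raw capacity for data while RAID 5 keeps three quarters, which is the parity overhead noted above.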