RAID and RAID Administration
RAID (Redundant Array of Independent Disks) refers to multiple independent hard drives combined to form a single large logical array. Data is stored on the array of disks together with additional redundancy information. The redundancy information can be either the data itself (mirroring) or parity information calculated from several data blocks (RAID 4 or RAID 5). When RAID is in place, the operating system no longer deals with the individual drives, but instead with a set of virtual disks.
RAID serves the dual purpose of increasing performance and providing redundancy, with different RAID levels offering different trade-offs between the two. A RAID level with redundancy prevents downtime in the event of a hard disk failure; however, it cannot recover data that has been deleted by a user or destroyed by a major event such as a fire.
Where exactly is the RAID process executed? RAID can reside in several places in the I/O path.
Each level of RAID spreads the data across the drives of the array in a different way and is optimized for specific situations. Let us examine the most common RAID levels.
RAID 0
RAID 0 (striping) interleaves data among two or more physical disks of the array. A striped logical disk consists of two or more sub-disks, so portions of two or more hard drives are combined and read/write performance improves. However, no redundancy information is stored in a RAID 0 array, which means that if one hard drive fails, all data is lost. RAID 0 is therefore usually not used in servers where availability is a concern.
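The interleaving described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular controller works: the function name `locate_block` is hypothetical, and the stripe unit is assumed to be exactly one block.

```python
def locate_block(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk).

    Assumes a stripe unit of one block; real arrays usually stripe in
    larger chunks.
    """
    if num_disks < 2:
        raise ValueError("striping needs at least two disks")
    return logical_block % num_disks, logical_block // num_disks


# Consecutive logical blocks land on different disks, which is why
# sequential reads and writes can proceed in parallel.
layout = [locate_block(block, num_disks=3) for block in range(6)]
print(layout)
```

Because every disk holds a share of every large file, the loss of any single disk leaves gaps throughout the data, which is the reason a RAID 0 array cannot survive a drive failure.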
RAID 1
In a RAID 1 system, identical data is stored on two hard disks (100 percent redundancy). When one disk drive fails, all data is immediately available on the other without any impact on performance or data integrity. The term "disk mirroring" applies when two disk drives are mirrored on one SCSI channel. If each disk drive is connected to a separate SCSI channel, the arrangement is called "disk duplexing", which provides additional security. RAID 1 represents an easy and highly efficient solution for data security and system availability.
RAID 1 provides redundancy and high availability: one disk may fail, but the logical drive with the data is still available. However, this level requires two disks, and only the capacity of one of them is available for storage. Some advanced controllers can issue reads and writes to both disks of a mirrored or duplexed pair in parallel.
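As a rough sketch of the mirroring behavior, the following Python class keeps two in-memory "disks" in step and continues to serve reads after one of them fails. The class name `Mirror` and its attributes are hypothetical, and dictionaries stand in for real drives.

```python
class Mirror:
    """Toy two-disk mirror: every write goes to both members."""

    def __init__(self) -> None:
        self.disks = [{}, {}]          # block number -> data, one dict per disk
        self.failed = [False, False]   # simulated failure flags

    def write(self, block: int, data: bytes) -> None:
        # Write the same data to every member that is still working.
        for index, disk in enumerate(self.disks):
            if not self.failed[index]:
                disk[block] = data

    def read(self, block: int) -> bytes:
        # Serve the read from the first member that is still working.
        for index, disk in enumerate(self.disks):
            if not self.failed[index]:
                return disk[block]
        raise IOError("both mirror members have failed")


mirror = Mirror()
mirror.write(0, b"payload")
mirror.failed[0] = True        # simulate the loss of the first disk
print(mirror.read(0))          # data is still served by the second disk
```

The sketch also makes the capacity cost visible: two disks are occupied, but only one disk's worth of distinct data is stored.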
RAID 4
RAID 4 is very similar to RAID 0 in that data is striped across the disk drives. However, the RAID 4 controller also calculates redundancy (parity information) and stores the parity blocks (P1, P2, and so on) on a separate disk drive. Even when one disk drive fails, all data is still fully available: the missing data is derived from the data that remains available and from the parity information. Unlike RAID 1, only the capacity of one disk drive is needed for the redundancy. For example, in a RAID 4 disk array with 5 disk drives, 80 percent of the installed disk drive capacity is available as user capacity and only 20 percent is used for redundancy. In situations with many small data blocks, the parity disk drive becomes a throughput bottleneck. With large data blocks, RAID 4 shows significant performance gains.
RAID 4 provides high availability since one disk may fail while the logical drive with the data remains available. It also makes good use of disk capacity (in an array of n disks, the capacity of n-1 disks is used for data storage). However, this method involves the complex calculation of redundancy information, which limits write performance.
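The parity idea can be illustrated with a short Python sketch. It assumes the usual bytewise XOR parity; the helper names `xor_blocks` and `usable_fraction` and the sample data are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def usable_fraction(num_disks: int) -> float:
    """Share of installed capacity left for data when one disk holds parity."""
    return (num_disks - 1) / num_disks


# One stripe across four data disks; the fifth disk stores the parity block.
data = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01", b"\x10\x20"]
parity = xor_blocks(data)

# Simulate losing the third data disk and rebuild its block from the
# surviving data blocks plus the parity block.
survivors = data[:2] + data[3:]
rebuilt = xor_blocks(survivors + [parity])

print(rebuilt == data[2])      # the lost block is recovered
print(usable_fraction(5))      # 5 disks: 80 percent user capacity
```

Every write to any data block also requires an update of the parity block on the dedicated disk, which is where the small-block bottleneck mentioned above comes from.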
RAID 5
Unlike RAID 4, the parity data in a RAID 5 disk array is striped across all disk drives. The RAID 5 disk array therefore delivers a more balanced throughput. The response time is excellent even with small data blocks, which are very common in multitasking and multi-user environments. RAID 5 offers the same level of security as RAID 4: when one disk drive fails, all data is still fully available. The missing data is recalculated from the data that remains available and from the parity information.
RAID 5 offers high availability since one disk may fail while the logical drive with the data remains available. This method also makes very good use of disk capacity (in an array of n disks, the capacity of n-1 disks is used for data storage). However, the calculation of redundancy information limits write performance.
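To show what distributing the parity means in practice, the sketch below prints which disk holds the parity block for each stripe. The rotation rule used here (parity moves one disk to the left per stripe) is an assumption chosen for illustration; actual controllers use several different rotation schemes. The function names are hypothetical.

```python
def parity_disk(stripe: int, num_disks: int) -> int:
    """Index of the disk holding parity for a stripe (assumed rotation rule)."""
    return (num_disks - 1 - stripe) % num_disks


def stripe_layout(stripe: int, num_disks: int) -> list[str]:
    """Label each disk's block in a stripe as data (D) or parity (P)."""
    parity_index = parity_disk(stripe, num_disks)
    labels = []
    data_number = stripe * (num_disks - 1)   # first data block of this stripe
    for disk in range(num_disks):
        if disk == parity_index:
            labels.append(f"P{stripe}")
        else:
            labels.append(f"D{data_number}")
            data_number += 1
    return labels


# With four disks, the parity block visits every disk once in four stripes,
# so no single drive carries all of the parity traffic.
for stripe in range(4):
    print(stripe_layout(stripe, num_disks=4))
```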
RAID 10 and RAID 0+1
RAID 10 or RAID 0+1 is a combination of RAID 0 (performance) and RAID 1 (data security). Unlike RAID 4 and RAID 5, there is no need to calculate parity information. RAID 10 (or 0+1) disk arrays offer good performance and data security. As with RAID 0, optimum performance is achieved in highly sequential load situations. As with RAID 1, 50 percent of the installed capacity is lost to redundancy. However, I/O throughput may be increased with advanced RAID controllers.
This level provides high availability since one disk may fail while the logical drive with the data remains available. It also provides good write performance. However, it requires an even number of disks, with a minimum of four. Only half of the installed disk capacity is usable, but most of the I/O capacity is usually available. This level is often recommended and is the most popular for database systems.
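The combination of striping and mirroring can be sketched as follows. The example assumes the RAID 10 arrangement, in which data is striped across mirrored pairs of disks, with disks 0 and 1 forming the first pair, disks 2 and 3 the second, and so on. The function names and the pairing scheme are illustrative assumptions.

```python
def raid10_targets(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Return the two disks that both receive a copy of a logical block."""
    if num_disks < 4 or num_disks % 2 != 0:
        raise ValueError("needs an even number of disks, at least four")
    pairs = num_disks // 2
    pair = logical_block % pairs           # stripe across the mirrored pairs
    return 2 * pair, 2 * pair + 1          # both members of the chosen pair


def usable_capacity(num_disks: int, disk_size_gb: int) -> int:
    """Half of the installed capacity remains available for data."""
    return (num_disks // 2) * disk_size_gb


for block in range(4):
    print(block, raid10_targets(block, num_disks=4))
print(usable_capacity(num_disks=4, disk_size_gb=500))
```

No parity is computed anywhere in this scheme: a write is simply sent to both members of one pair, which is why write performance stays good at the cost of half the capacity.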
For more information, see the book Oracle 11g Grid and Real Application Clusters (Rampant TechPress).