RAID 5 with 22 disks

2007-12-25 10:36:00

Hi Managers,

Thanks to:

From: "Birger A. Wathne" <birger@Vest.Skrivervik.No>

From: vnarayan@haverford.edu (Vasantha Narayanan)

From: patesa@aur.alcatel.com (Sanjay Patel)

From: David Robson <robbo@box.net.au>

From: Jim Harmon <jharmon@telecnnct.com>

I had 22 disks of 2 GB each that I wanted to set up as RAID 5, but I came

across a previous summary that said:

"Sun advised that more than 6 disks in a RAID 5 stripe was _bad_. Brian Wong's

paper suggested that we can create RAID 5 stripes, each with six disks, and

then _concatenate_ them together to make larger devices!"

        In Brian Wong's paper

        (http://www.sun.com/sunworldonline/swol-09-1995/swol-09-raid5.html)

        I found out the answer to my question: Why it is wise to limit the

        width of a parity RAID volume to no more than 6 disks:

        Suppose you have a 30 disk RAID with parity (RAID 3 or RAID 5), and

        one of them fails. A read that touches the failed member then requires

        29 physical I/O operations to rebuild its data! Writes to such a volume are also

        very expensive. He says that most array software permits the

        concatenation of multiple RAID volumes if larger capacity volumes are

        required.
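
To make the degraded-read cost concrete, here is a minimal Python sketch, assuming plain XOR parity over a single stripe. The helper names (xor_blocks, make_stripe, degraded_read) and the 16-byte block size are made up for illustration and are not part of DiskSuite or of the paper. It rebuilds a failed member's block from the survivors and counts the member reads that took.

```python
import os
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(width, block_size=16):
    """Build one stripe: width-1 random data blocks plus one parity block."""
    data = [os.urandom(block_size) for _ in range(width - 1)]
    return data + [xor_blocks(data)]


def degraded_read(stripe, failed):
    """Rebuild the block of a failed member from all surviving members.

    Returns the rebuilt block and the number of member reads it took.
    """
    survivors = [blk for i, blk in enumerate(stripe) if i != failed]
    return xor_blocks(survivors), len(survivors)


for width in (6, 30):
    stripe = make_stripe(width)
    rebuilt, reads = degraded_read(stripe, failed=0)
    assert rebuilt == stripe[0]  # the lost block really is recoverable
    print(f"{width}-disk set: {reads} member reads for one block of the failed disk")
```

The read count is always the set width minus one, which is why a narrow set degrades far more gracefully than a 30-disk one.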

        Jim Harmon thought that the "limit" the paper talks about is the RAID

        controller, since a RAID controller (per channel) can only support 6

        drives on one chain. While it is true that the _controller_ can

        support only 6 devices, this is not what the paper was talking about

        (see above). I had these disks on 5 different channels.

        Jim went on to give some useful info about controllers:

        However, that is FAST NARROW SCSI. In a FAST WIDE SCSI system, it is

        theoretically possible to mount 256 drives and under the right

        management program, treat them all as one huge virtual drive.

                Typically, a 4-channel or 5-channel RAID Controller can easily

        control 32 -or more- drives, and under various levels of raid, can be

        configured as separate drives, collections of drives, or one virtual

        drive. Mirroring, striping, hotswapping, etc. can all be mixed under

        the newer controllers.

        David Robson commented: With RAID 5, write access is considerably

        slower than normal and causes a significant system overhead

        continually recalculating the parity. If a disk dies, the system will

        (should) hold up but performance will degrade further, and then when

        you recover the replacement disk it will take quite some time! You

        should also note that "growing" a RAID 5 metadisk is not recommended,

        which means if you have 4 disks in a RAID 5 device and try to add two

        more, performance may be reduced. This means you will have to back off

        your data and recreate the entire device! If you can afford the

        disks, concatenate and then mirror to gain redundancy (that's what Sun

        recommended to me).

        He's right about everything except the "write access is considerably

        slower than normal and causes a significant system overhead continually

        recalculating the parity" part. Brian Wong's paper says that this is

        a common misconception. According to him "This process is commonly and

        erroneously thought to be the most expensive part of RAID-5 overhead,

        but parity computation consumes less than a millisecond, a figure

        dwarfed by the typical 3-15 millisecond service times for I/O to

        member disks."
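
The point about parity being cheap is easier to see in code. The sketch below assumes the read-modify-write update commonly described for RAID 5 in general (new parity = old parity XOR old data XOR new data); whether DiskSuite does exactly this internally is not something this summary confirms. The function names and the tiny 4-byte blocks are hypothetical.

```python
def xor_blocks(*blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def small_write_parity(old_data, old_parity, new_data):
    """Parity after overwriting one data block, without reading the others."""
    return xor_blocks(old_parity, old_data, new_data)


# Hypothetical stripe: 4 data blocks plus their parity.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40",
        b"\xaa\xbb\xcc\xdd", b"\x0f\x0e\x0d\x0c"]
parity = xor_blocks(*data)

new_block = b"\xff\x00\xff\x00"
new_parity = small_write_parity(data[1], parity, new_block)

# Cross-check against recomputing parity from the whole updated stripe.
data[1] = new_block
assert new_parity == xor_blocks(*data)

# Member I/Os for this update: read old data, read old parity,
# write new data, write new parity.
MEMBER_IOS_PER_SMALL_WRITE = 4
```

The XOR is a handful of CPU operations per byte; the four member I/Os, at the 3-15 millisecond service times quoted above, are where the time goes.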

My first question was if I could create 4 independent metadevices (RAID 5),

each with one hot spare and then mount each of the metadevices under a

different mount point.

        The answer is yes, it is indeed possible. I went ahead and set up 4 RAID 5

        metadevices each with 5 disks. But instead of associating each

        metadevice with a hotspare, (thanks to help from Sanjay Patel, Birger A.

        Wathne and Vasantha Narayanan) I created a hot spare pool with 2 hot

        spares in it. I associated the pool with each metadevice by indicating

        this in the md.tab file.

        Vasantha Narayanan wasn't sure if we could use a single disk as the

        hotspare for multiple metadevices. This is definitely possible. All

        you need to do is to indicate this in md.tab.

        Birger A. Wathne pointed out that I did not need one spare for each

        raid set (as was my original plan). He felt that 1 spare disk for all

        raid sets should be enough since raid5 sets can run with a disk failure.

        He also said: The rule is that you cannot survive two failed disks in

        the same RAID 5 set. With one hot spare, the first disk failure puts the

        RAID set containing the failed disk in a critical situation only for a

        limited time. The file system cannot survive another hit while the hot

        spare is syncing up. But after that, you are ready to take a minimum of two

        more blows before you lose any file system.

                I have been told to expect 1 to 2 % failed disks each year in

        big disk farms. My own experience is that the failure rate for new

        disks is rather high in the first months, so be very vigilant for the first

        2 months.
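
For a feel of what a 1 to 2 % annual failure rate means for a farm of 22 disks, here is a rough Python sketch. It assumes failures are independent and the rate is constant over the year, which is a simplification (and ignores the higher early failure rate mentioned above). The function name is made up for illustration.

```python
def p_at_least_one_failure(disks, annual_rate):
    """Chance that at least one of `disks` fails within a year,
    assuming independent failures at a constant annual rate."""
    return 1 - (1 - annual_rate) ** disks


DISKS = 22
for rate in (0.01, 0.02):
    expected = DISKS * rate
    chance = p_at_least_one_failure(DISKS, rate)
    print(f"rate {rate:.0%}: expect {expected:.2f} failures/year, "
          f"{chance:.0%} chance of at least one")
```

Under these assumptions the chance of seeing at least one failure in a year is roughly one in five to one in three, which is a reasonable argument for keeping at least one hot spare in the pool.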

My second question was: If I create RAID 5 stripes, and then concatenate them

together to make larger devices, would this be a good thing to do? If so, how

would I do it?

        I ended up not doing any striping (I'm quitting this Friday, and I did

        not want to leave behind something I wasn't sure of).

        Sanjay Patel's suggestions:

        Hot spare pool -> 2 disks

        RAID 5 - 1 -> 6 disks

        RAID 5 - 2 -> 6 disks

        RAID 5 - 3 -> 5 disks

        RAID 5 - 4 -> 3 disks

        concat/stripe 1 -> contains RAID 5 - [ 1 thru 4 ]

        total disk space available (raw) will equal 16 x disk size

        (20 disks in the RAID sets, less one disk's worth of parity per set)

        note: 2.1 GB disks have a formatted capacity of 1.8 GB.

        RAW = (16 * 1.8) = 28.8 GB

        attach the hot spare pool to all raid stripes.

        if your disks are hot swappable, then i would only have one hot

        spare and place the extra disk with the RAID 5 - 3 stripe.
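
The capacity arithmetic in this suggestion can be checked with a short Python sketch. The set sizes and the 1.8 GB formatted capacity come from the suggestion itself; the function and variable names are only illustrative. It also works out the one-hot-spare variant, with the freed disk added to the third set.

```python
FORMATTED_GB = 1.8  # formatted capacity of one disk, from the note above


def layout_summary(raid5_sets, hot_spares):
    """Return (total disks used, data disks, usable GB) for a layout.

    Each RAID 5 set gives up one disk's worth of space to parity.
    """
    total_disks = sum(raid5_sets) + hot_spares
    data_disks = sum(n - 1 for n in raid5_sets)
    return total_disks, data_disks, data_disks * FORMATTED_GB


# Layout as suggested: sets of 6, 6, 5 and 3 disks plus 2 hot spares.
suggested = layout_summary([6, 6, 5, 3], hot_spares=2)

# Variant for hot swappable disks: 1 hot spare, extra disk in the third set.
variant = layout_summary([6, 6, 6, 3], hot_spares=1)

for name, (total, data, gb) in (("suggested", suggested), ("variant", variant)):
    print(f"{name}: {total} disks, {data} data disks, {gb:.1f} GB usable")
```

Both layouts use all 22 disks; the first reproduces the 16 x 1.8 = 28.8 GB figure, and the variant gains one more disk's worth of data space.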

        

        to create a concatenate/stripe of all the raid devices in DiskSuite,

        simply create an empty concatenate/stripe and place all of the raid

        devices you have previously created into the concatenate/stripe as if

        they were normal disks.

        

        a hot swappable disk is a disk that can be unplugged while the system

        is running. most SSAs (110, 112, 114) are not truly hot swappable since

        an entire tray has to be removed to replace a disk. hot swappable arrays

        include Netras, DiskPacks (the new type), and the RSM arrays.

        

        An example of concatenation in md.tab for Solaris 2.5.1 & DiskSuite 4.0:

        

        if you are starting this server from scratch, i would recommend you get

        SDS 4.1 (don't forget to download the patches). if you don't have a

        copy and you need to use SDS 4.0:

        

        create the RAID 5 devices, then to concatenate them:

        

        /dev/md/dsk/d? 4 1 /dev/md/dsk/d? 1 /dev/md/dsk/d? 1 \

                /dev/md/dsk/d? 1 /dev/md/dsk/d?

        

        the first d? is the next available metadevice number

        the 4 is the number of items to concatenate; each 1 that follows says the

        next item consists of a single device, and the devices themselves are the

        members of the concatenation (i.e. your raid stripes)
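
As an illustration of the pattern only, here is a small Python sketch that formats such an entry from a list of RAID 5 metadevices. The metadevice numbers (d10 to d13 for the RAID sets, d20 for the concatenation) are hypothetical placeholders, not values from my setup, and the helper is just string formatting that mirrors the line quoted above.

```python
def concat_line(target, components):
    """Format a concatenation entry in the pattern quoted above:
    target, number of items, then '1 <device>' for each item."""
    parts = [target, str(len(components))]
    for device in components:
        parts += ["1", device]
    return " ".join(parts)


# Hypothetical metadevice numbers, chosen only for this example.
raid_sets = [f"/dev/md/dsk/d{n}" for n in (10, 11, 12, 13)]
print(concat_line("/dev/md/dsk/d20", raid_sets))
```

Check the resulting line against the md.tab documentation for your DiskSuite release before using it.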

        

        in SDS 4.1, it's all GUI, and it's all point-click, drag & drop :->

---------------------------------------------------------------------------

Thanks much.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

~ Rasana Atreya Voice: (415) 476-3623 ~

~ System Administrator Fax: (415) 476-4653 ~

~ Library & Ctr for Knowledge Mgmt, Univ. of California at San Francisco ~

~ 530 Parnassus Ave, Box 0840, San Francisco, CA 94143-0840 ~

~ atreya@library.ucsf.edu ~

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
