[vox-tech] RAID hardware recommendations, software support info

ME vox-tech@lists.lugod.org
Wed, 30 Apr 2003 19:44:51 -0700 (PDT)


> Howdy!
>
> Does anybody here have RAID under Linux?

Yep.

> Can I ask some questions?:

Yep. (you just did!  Hah! ;-)

>    - What hardware do you use? (controller, drive model/type/size)

Dell PowerEdge server (3 years old) came with a MegaRAID controller that
has/had drivers built into the kernel (2.4). Worked great.

I think I/we also used a QLogic RAID controller (but I might be mistaken).
For that one, though, while there was a loadable module available from the
vendor, the driver source was not built into the Linux kernel.

Both were SCSI-based RAID controllers, not IDE/EIDE.

>    - Do you use software RAID or hardware RAID?

Have used both. I prefer hardware-based RAID.

>    - What software do you use to manage the RAID system?

Software? For the hardware-based RAID, the controller comes with a mini
BIOS and a set of tools accessible at boot time for configuring logical
volumes and the RAID level/disks used for each. From there, I could set a
hot spare for failover support. That is also where initialization and
formatting of the volumes was done.
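
For the software RAID I have used under Linux, there is no BIOS menu; the
md arrays are managed from the running system with raidtools or the newer
mdadm, and the kernel reports status through /proc/mdstat. Something like
this (just a sketch; it assumes mdadm is installed and that /dev/md0 is
your array, so adjust the names for your own setup):

# Sketch: dump the state of a Linux software-RAID (md) array via mdadm.
# Assumes mdadm is installed; /dev/md0 is an example name, not necessarily yours.
import subprocess

ARRAY = "/dev/md0"

def show_array(array=ARRAY):
    # 'mdadm --detail' prints the array state, member disks, and any spares.
    result = subprocess.run(["mdadm", "--detail", array],
                            capture_output=True, text=True)
    print(result.stdout or result.stderr)

if __name__ == "__main__":
    show_array()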

>    - How do you know when a drive is damaged?

Hardware only: for the ones we use(d), there is an audible beep that lets
us know there is an issue, and it is paired with a light on the PowerVault
storage tower (with 12-15 SCSI drives in hot-swappable bays). The "idiot
lights" shine red when there is a failure, steady green when life is good,
and blink green when rebuilding. (I think that is what it did --- it has
been a while since a disk failure has occurred.)
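
On the software RAID (md) side there is no beep or idiot light; you have
to look at /proc/mdstat (or have something look at it for you). A healthy
two-disk mirror shows "[2/2] [UU]", and a dead member shows up as an
underscore. A quick-and-dirty checker along these lines (untested sketch;
run it from cron or however you like):

# Sketch: flag degraded Linux software-RAID (md) arrays from /proc/mdstat.
# A healthy 2-disk mirror reports "[2/2] [UU]"; a failed member shows "_".
import re
import sys

def degraded_arrays(path="/proc/mdstat"):
    bad = []
    current = None
    with open(path) as f:
        for line in f:
            m = re.match(r"^(md\d+)\s*:", line)
            if m:
                current = m.group(1)
            # The per-array status field looks like "[UU]", "[U_]", "[_U]", ...
            m = re.search(r"\[([U_]+)\]", line)
            if m and "_" in m.group(1) and current:
                bad.append(current)
    return bad

if __name__ == "__main__":
    bad = degraded_arrays()
    if bad:
        print("DEGRADED md arrays:", ", ".join(bad))
        sys.exit(1)
    print("all md arrays look healthy")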

>    - How do you swap a drive when it gets damaged?

The continuous beeeeeeeeeeeeeeeeeeeeeepppp is really annoying. The red
light is another hint. ;-)

Failover is the first step - this involves an extra drive that just sits
in a bay, unused. When a drive error comes up that is too serious to
"fix", the system automagically fails over to the hot spare and starts
rebuilding onto it.
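
(Linux software RAID can do the same trick, for what it is worth: md will
grab an idle spare and rebuild onto it by itself. Roughly, with made-up
device names /dev/md0 and /dev/sdc1 that you would substitute with your
own:)

# Sketch: add a hot spare to a Linux software-RAID (md) array with mdadm.
# The device names are examples only; use your own.
import subprocess

ARRAY = "/dev/md0"
SPARE = "/dev/sdc1"

# 'mdadm ARRAY --add DEVICE' attaches the device; on a healthy array it
# sits as a spare (shown with "(S)" in /proc/mdstat) until a member fails.
subprocess.run(["mdadm", ARRAY, "--add", SPARE], check=True)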

As to when to swap, it is best to *not* hot-swap out a drive and then
hot-swap in a new one when you can avoid it. It can be done, but there is
risk. If you can afford to schedule downtime, that is best.

Then you power down the system, pull out the bad drive, put in the good
replacement (with the same SCSI ID as the one taken out), and on reboot
the RAID controller (most of the time) realizes the new drive has no data
and rebuilds the missing data onto it.

Most vendors suggest you wait in the BIOS utilities menu until the RAID
set is rebuilt. However, that can take a long time, and I have actually
just let the system boot into the OS and run while the hardware RAID
rebuilt the new drive's data. That is slower, and there are some risks,
but we did not run into any of them the few times I have done this.
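
(For the software RAID case, you can at least watch the rebuild from the
OS: while md resyncs, /proc/mdstat grows a line with "recovery = N% ...
finish=...". A tiny sketch to poll it:)

# Sketch: watch Linux software-RAID (md) rebuild progress via /proc/mdstat.
# During a rebuild the kernel reports a line like
#   "recovery = 12.3% (.../...) finish=8.5min speed=..."
import re
import time

def rebuild_status(path="/proc/mdstat"):
    with open(path) as f:
        text = f.read()
    m = re.search(r"(recovery|resync)\s*=\s*([\d.]+)%.*?finish=(\S+)", text)
    return m.groups() if m else None

while True:
    status = rebuild_status()
    if status is None:
        print("no rebuild in progress")
        break
    kind, pct, eta = status
    print(f"{kind}: {pct}% done, finish={eta}")
    time.sleep(30)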

>    - Has a drive failed on you before?

Yes. I have only encountered a few failures in my time on RAID systems. I
think I am up to about 20 or so.

>  If so, how did it go with the drive swapping?

Never had any problems. Only attempted 1 or 2 hot-swaps, and that was a
scary thing 8-o   I don't like doing that. The state change from off to
on, SCSI device initialization on a running bus, the potential for a
power/signal drop to other devices, and the chance of mucking up data
being passed on the SCSI bus for reads/writes always seem risky to me ---
even when you have hardware that is designed to do that.

When you can do it, a safe power-down and a swap while powered off is
best. Hot spares are also good for giving you the buffer to schedule your
downtime weeks or months later.

NOTE: not all RAID hardware can be hot-swapped! Attempting to hot-swap on
hardware not designed for it is even riskier! Danger! Danger, Will Robinson!

-ME