Is anyone using RAID6 in production? In moving from hardware RAID on my dual
3ware 7500-8 based systems to md, I decided I'd like to go with RAID6
(since md is less tolerant of marginal drives than is 3ware). I did some
benchmarking and was getting decent speeds with a 128KiB chunk size.

So the next step was failure testing. First, I fired off memtest.sh as
found at <http://people.redhat.com/dledford/memtest.html>. Then, I did
'mdadm /dev/md0 -f /dev/sdo1', and it started to rebuild as it should. I
cranked up /proc/sys/dev/raid/speed_limit_min to 15000 so that it would
reconstruct in a decent amount of time (the default of 1000 was leading to
a 53 hour estimate for the recovery).
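
As a rough sanity check on those rebuild times, here is a minimal sketch of
the arithmetic. It assumes (none of this is confirmed above) that
speed_limit_min is expressed in KiB/s, that the rebuild runs at exactly that
floor, and that a "160GB" member holds 160 * 10**9 bytes. Under those
assumptions the default floor works out to tens of hours, the same range as
the estimate md reported, and raising it to 15000 cuts that by a factor of 15.

```python
# Rough resync-time estimate for one md member at a fixed rate.
# Assumptions (not from the post): the rate is in KiB/s, the rebuild runs at
# exactly that rate, and a "160GB" member holds 160 * 10**9 bytes.

def rebuild_hours(member_bytes: int, rate_kib_per_s: int) -> float:
    """Hours needed to rewrite member_bytes at rate_kib_per_s KiB/s."""
    seconds = member_bytes / (rate_kib_per_s * 1024)
    return seconds / 3600

MEMBER_BYTES = 160 * 10**9  # hypothetical size of one array member

for rate in (1000, 15000):
    print(f"{rate:>6} KiB/s -> {rebuild_hours(MEMBER_BYTES, rate):.1f} hours")
```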
But memtest.sh started kicking out errors (non-matching diffs). And then
I got this:

EXT3-fs error (device md0): ext3_journal_start_sb: Detected aborted journal
Remounting filesystem read-only
attempt to access beyond end of device
md0: rw=0, want=28987566088, limit=4595422208
attempt to access beyond end of device
md0: rw=0, want=28987566088, limit=4595422208
attempt to access beyond end of device
md0: rw=0, want=28987566088, limit=4595422208
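
To put the want/limit values in perspective, here is a small sketch that
converts them to sizes. It assumes both numbers are counts of 512-byte
sectors, which is my reading of the message and not something the log itself
states. On that assumption the requested offset is several times the size of
the whole device, not a small overrun past the end.

```python
# Convert the want/limit values from the log into sizes.
# Assumption (not stated in the log): both are counts of 512-byte sectors.

SECTOR_BYTES = 512

def sectors_to_tib(sectors: int) -> float:
    """Convert a sector count to TiB under the 512-byte assumption."""
    return sectors * SECTOR_BYTES / 2**40

want = 28987566088   # offset the filesystem tried to read
limit = 4595422208   # size of md0 as the kernel sees it

print(f"limit: {sectors_to_tib(limit):.2f} TiB")
print(f"want:  {sectors_to_tib(want):.2f} TiB")
print(f"want is {want / limit:.1f}x the device size")
```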

Needless to say, it's not giving me that warm fuzzy feeling. The one
caveat is that not all the members of my array were the same size -- one
disk is 180GB while all the rest are 160GB. I'm going to test overnight
with identically sized RAID members, but I also wanted to see if anyone
else is using RAID6.
Thanks.
--
Joshua Baker-LePain
Department of Biomedical Engineering
Duke University