Hi,

Which method of RAID is preferred with Lustre: hardware or software?

This might seem like a daft question, but I'm a newbie. The list, the
operations guide and various "best practice" papers do not appear to
express a preference.

I'm thinking about a system with:

* 2x failover MDS with a RAID1 or RAID10 volume
* 2x failover OSS with RAID5 or RAID6 volumes.

I'm trying to gauge whether it's worth having shared storage arrays for
each failover set with hardware RAID, or just leaving them as
dual-attached JBODs.

My instinct says that hardware RAID from a reputable vendor is best -
particularly because there's a battery-backed cache - but I see from the
lists that Lustre has put a lot of effort into improving the Linux MD
RAID layer.

Thanks,

Mark

-- 
-----------------------------------------------------------------
Mark Dixon                       Email : m.c.dixon at leeds.ac.uk
HPC/Grid Systems Support         Tel (int): 35429
Information Systems Services     Tel (ext): +44(0)113 343 5429
University of Leeds, LS2 9JT, UK
-----------------------------------------------------------------
On Nov 20, 2008 12:26 +0000, Mark Dixon wrote:
> Which method of RAID is preferred with Lustre: hardware or software?
>
> This might seem like a daft question, but I'm a newbie. The list,
> operations guide and various "best practice" papers do not appear to
> express a preference.

The great thing about Lustre is that you can make this decision entirely
based on what kind of price/performance/reliability you need, and not
because of a specific hardware requirement.

> I'm thinking about a system with:
>
> * 2x failover MDS with a RAID1 or RAID10 volume
> * 2x failover OSS with RAID5 or RAID6 volumes.
>
> I'm trying to gauge whether it's worth having shared storage arrays for
> each failover set with hardware RAID, or just leave them as
> dual-attached JBODs.
>
> My instinct says that hardware RAID from a reputable vendor is best -
> particularly because there's a battery-backed cache - but I see from the
> lists that Lustre has put a lot of effort in improving the Linux MD RAID
> layer.

There are a number of large clusters (TACC Ranger in particular) that use
software RAID on JBODs, but the majority of systems use hardware RAID in
order to maximize performance (at an increased cost, of course).

These days I would tend to recommend RAID-6 over RAID-5, just because
today's large disks take a long time to rebuild, and there is a non-zero
risk of a second disk failing during that time.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
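Andreas's rebuild-window argument can be put into rough numbers. A
minimal back-of-envelope sketch in Python, assuming an exponential
failure model, an illustrative per-drive MTBF of 1,000,000 hours and a
sustained rebuild rate of 50 MB/s (both figures are my assumptions for
illustration, not numbers from this thread):

```python
from math import exp

# Illustrative assumptions (not figures from the thread):
MTBF_HOURS = 1000000     # assumed per-drive mean time between failures
REBUILD_MBPS = 50.0      # assumed sustained rebuild rate

def second_failure_risk(surviving_drives, drive_tb):
    """Rough P(another drive fails before the rebuild completes).

    Models each surviving drive as failing independently at an
    exponential rate of 1/MTBF; the window of vulnerability is the
    time needed to rewrite one drive's worth of data.
    """
    rebuild_hours = drive_tb * 1e6 / REBUILD_MBPS / 3600.0
    failure_rate = surviving_drives / MTBF_HOURS  # failures per hour
    return 1.0 - exp(-failure_rate * rebuild_hours)

# Bigger drives mean a longer rebuild window, hence more exposure:
print(second_failure_risk(10, 0.5))  # 500 GB drives
print(second_failure_risk(10, 1.0))  # 1 TB drives: roughly twice the risk
```

The absolute numbers are only as good as the assumed MTBF and rebuild
rate, but the scaling is the point: the risk grows linearly with drive
size, which is what makes the second parity disk of RAID-6 attractive.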
Thanks Klaus, Andreas,

"Correct and slow" sounds generally preferable to "wrong and fast" - so I
guess in this situation hardware RAID still wins.

I get worried when I hear Sun's marketing department talk about the
"RAID5 write hole" whenever they tout ZFS. Clearly, moving to RAID6 does
not solve this. Even if you have a UPS, systems sometimes still come down
hard.

It's interesting that TACC is using software RAID on JBODs, as the cost
of HW RAID would clearly not have been an issue for them.

Many thanks once again,

Mark

-- 
-----------------------------------------------------------------
Mark Dixon                       Email : m.c.dixon at leeds.ac.uk
HPC/Grid Systems Support         Tel (int): 35429
Information Systems Services     Tel (ext): +44(0)113 343 5429
University of Leeds, LS2 9JT, UK
-----------------------------------------------------------------
Andreas Dilger wrote:
> There are a number of large clusters (TACC Ranger in particular) that
> use software RAID on JBODs, but the majority of systems use hardware
> RAID in order to maximize performance (at an increased cost of course).
>
> These days I would tend to recommend using RAID-6 over RAID-5 just
> because the large disks available take a long time to rebuild, and
> there is a non-zero risk of a second disk failing during that time.

What about using more, but smaller, RAID groups? For example, perhaps
4-5 drives in a RAID-5? That way, if a disk fails, the rebuilds are
faster since there is less data?

Jeff
On Monday 24 November 2008, Jeff Layton wrote:
> Andreas Dilger wrote:
> > There are a number of large clusters (TACC Ranger in particular) that
> > use software RAID on JBODs, but the majority of systems use hardware
> > RAID in order to maximize performance (at an increased cost of course).
> >
> > These days I would tend to recommend using RAID-6 over RAID-5 just
> > because the large disks available take a long time to rebuild, and
> > there is a non-zero risk of a second disk failing during that time.
>
> What about using more, but smaller raid groups? For example,
> perhaps 4-5 drives in a RAID-5? That way if a disk fails, the
> rebuilds are faster since there is less data?

I'd pick RAID-6 not so much for the time-window/"drive fail" risk as for
the read error rate. I've seen numbers for SATA drives of about one
unrecoverable read error every 10 TB or so. If so, then rebuilding a
10+1 RAID-5 is likely (on average) to see one sector read error - and
you're allowed zero to manage a perfect rebuild. A 5+1 set would be
about 50% likely to hit one, and so on.

How your RAID controller (or software) reacts to a failed sector read
varies, but behaviours include: continuing as if nothing happened (you
now have bad data on your RAID set), failing the offending drive (and
with it the rebuild and the entire RAID set), ...

/Peter
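Peter's arithmetic is easy to reproduce. A minimal sketch assuming the
roughly one-URE-per-10-TB rate he quotes; the Poisson step is my
addition, to turn the expected error count into a probability:

```python
from math import exp

URE_PER_TB = 1.0 / 10.0  # ~1 unrecoverable read error per 10 TB, as quoted

def expected_ures(data_read_tb):
    """Expected unrecoverable read errors while reading data_read_tb TB."""
    return data_read_tb * URE_PER_TB

def p_at_least_one(data_read_tb):
    """P(>= 1 read error), treating errors as a Poisson process."""
    return 1.0 - exp(-expected_ures(data_read_tb))

# Rebuilding a 10+1 RAID-5 of 1 TB drives reads the 10 surviving drives:
print(expected_ures(10.0))    # 1.0 expected error - and you're allowed zero
print(p_at_least_one(10.0))   # ~63% chance of hitting at least one
# A 5+1 set of the same drives reads only 5 TB:
print(expected_ures(5.0))     # 0.5, roughly the "50% likely" above
```

The drive size here (1 TB) is illustrative; the expected error count
scales with the total data read during the rebuild, which is why wider
RAID-5 sets of large drives are the worst case.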
Hey Jeff,

Smaller _drives_ usually have shorter rebuild times, but having _fewer_
drives normally makes little difference to rebuild times. Both a 4+1 and
an 8+1 need to write out "one drive" worth of data to the replacement
drive (and read every other drive in the RAID set once, but those reads
all happen in parallel).

Kevin

Jeff Layton wrote:
> What about using more, but smaller raid groups? For example,
> perhaps 4-5 drives in a RAID-5? That way if a disk fails, the
> rebuilds are faster since there is less data?
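Kevin's observation falls straight out of the arithmetic: if you model a
rebuild as rewriting one failed drive at a fixed rate, the group width
never appears in the formula. A sketch with an assumed 50 MB/s rebuild
rate (my figure, for illustration only):

```python
def rebuild_hours(drive_tb, rate_mbps=50.0):
    """Hours to rewrite one failed drive's worth of data.

    The surviving drives are read in parallel, so group width
    (4+1 vs 8+1) drops out: only drive size and rebuild rate matter.
    """
    return drive_tb * 1e6 / rate_mbps / 3600.0

# A 4+1 and an 8+1 of the same 1 TB drives rebuild in the same time;
# halving the drive size is what halves the rebuild:
print(rebuild_hours(1.0))
print(rebuild_hours(0.5))
```

In practice the read side can bottleneck on a shared bus or controller,
so very wide sets may rebuild somewhat slower than this model suggests,
but drive size remains the dominant term.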