On Jan 23, 2007 17:23 -0500, Jerome, Ron wrote:
> I am going to setup a lustre file system using the following hardware and
> am looking for some advice on how to apportion the disk space...
>
> * 4 OSS's, each with 16 500G SATA drives on a ARECA RAID
>   controller
> * 1 MDS with a system drive and two 300G drives for metadata
>
> Basically I'm wondering what would be the best way to slice up the
> raid arrays. The choices are one raid set and one large LUN, one raid set and
> multiple smaller LUN's (OST's), or multiple smaller raid sets.

I would recommend against making a single 8TB OST per OSS. You can get
better performance from a couple of smaller OSTs, because independent
RAID sets deliver more IOPS than a single large one. Also, you are more
likely to do full RAID-width IOs if the RAID sets are smaller. Something
like 3x(4+1) RAID sets + 1 hot spare seems like a good use of the disks.
That gives you 3 OSTs/OSS, each one being 4 * 500GB = 2TB in size.

Alternatively, 2x(7+1). This has the slight drawback that individual
1MB Lustre IOs do not divide evenly into full RAID stripes (more
overhead) and that there are no hot spares, but it has the benefit of
less disk overhead: only 2 disks go to parity. (Using 2x(6+1)+2 still
has 4 "overhead" disks, just like the 3x(4+1)+1 setup does.)

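To put numbers on the trade-off between these layouts, here is a rough sketch in Python. It assumes RAID5-style sets with one parity disk each and power-of-two chunk sizes; neither is stated in the original post, and the describe() helper is purely illustrative.

```python
# Sketch: compare the RAID layouts discussed above for one OSS.
# Assumptions (not from the original post): one parity disk per RAID set
# and power-of-two chunk sizes.

DISKS_PER_OSS = 16
DISK_GB = 500
LUSTRE_IO_KB = 1024  # one 1MB Lustre IO


def describe(sets, data_disks, spares):
    """Summarize `sets` RAID sets of (data_disks + 1 parity) plus hot spares."""
    used = sets * (data_disks + 1) + spares
    if used > DISKS_PER_OSS:
        raise ValueError("layout needs more disks than the OSS has")
    return {
        "osts": sets,                                # one OST per RAID set
        "ost_size_gb": data_disks * DISK_GB,
        "usable_gb": sets * data_disks * DISK_GB,
        "overhead_disks": sets + spares,             # parity disks + hot spares
        # True when 1MB splits into equal whole-KB chunks across the data disks
        "full_stripe_1mb": LUSTRE_IO_KB % data_disks == 0,
    }


layouts = {
    "3x(4+1)+1": describe(3, 4, 1),
    "2x(7+1)": describe(2, 7, 0),
    "2x(6+1)+2": describe(2, 6, 2),
}

for name, info in layouts.items():
    print(name, info)
```

Only the 4+1 sets let a 1MB IO land as exactly one full stripe (4 chunks of 256KB); the 7+1 sets buy an extra 1TB of usable space per OSS at the cost of partial-stripe writes and no spare.
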
Of course, benchmarking is the way to go here. You should get the
lustre-iokit and test with sgpdd-survey. It is destructive to the data
on the disks, so run it first, before you put any real data on them.

Also be sure to use the mke2fs "-R stride=" option to have it spread
the bitmaps over all of the disks.

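As a rough illustration of how the stride value is derived: it is the RAID chunk size expressed in filesystem blocks. The sketch below assumes a 4+1 set with 256KB chunks (so that 4 data disks make up one 1MB stripe) and 4KB filesystem blocks. Both numbers are assumptions for the example, not values from this thread, so substitute whatever your controller is actually configured with.

```python
# Sketch: derive the mke2fs stride value from the RAID chunk size.
# The 256KB chunk and 4KB block size are assumed for illustration only.

def stride_blocks(chunk_kb, block_kb=4):
    """Number of filesystem blocks written to one disk before moving to the next."""
    if chunk_kb % block_kb:
        raise ValueError("chunk size must be a multiple of the block size")
    return chunk_kb // block_kb


data_disks = 4
chunk_kb = 1024 // data_disks        # 256KB chunks -> 1MB full stripe
print(f"-R stride={stride_blocks(chunk_kb)}")   # prints: -R stride=64
```
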
Definitely RAID1 for the MDS.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.