On Dec 19, 2003 14:05 +0800, Phil Schwan wrote:
> Donny Cooper wrote:
> > MDS Server Considerations:
> > (1) Any recommended ratio of MDS memory to the number of OSTs.
> > Does it make sense to assume, the greater number of OSTs, the more
> > "muscle" your MDS should have?
>
> If you have large directories (1 million or more files) or a large
> number of directories which see a lot of use, you may benefit
> significantly from having more memory. In an environment where clients
> randomly access one of 10 million files, for example, having extra
> memory for the cache improves performance very significantly.

Also, if you have a very heavy metadata load, having faster/more CPUs on
the MDS can improve performance.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

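To put a rough number on the "randomly access one of 10 million files" case: if every file is equally likely to be touched, the expected cache hit rate is simply the fraction of entries that fit in memory. The sketch below is a back-of-envelope estimate only. The function name `uniform_hit_rate` and the 1 KiB-per-entry figure are assumptions made for illustration, not Lustre measurements; substitute a value measured on your own MDS.

```python
GIB = 1024 ** 3

# HYPOTHETICAL: memory consumed on the MDS per cached file entry.
# This is an illustrative placeholder, not a figure from Lustre.
BYTES_PER_ENTRY = 1024


def uniform_hit_rate(total_files: int, cache_bytes: int, bytes_per_entry: int) -> float:
    """Expected hit rate when each file is equally likely to be accessed."""
    if total_files <= 0 or bytes_per_entry <= 0:
        raise ValueError("total_files and bytes_per_entry must be positive")
    cached_entries = cache_bytes // bytes_per_entry
    # Once every entry fits in memory, extra RAM cannot raise the hit rate.
    return min(1.0, cached_entries / total_files)


if __name__ == "__main__":
    for gib in (1, 2, 4, 8, 16):
        rate = uniform_hit_rate(10_000_000, gib * GIB, BYTES_PER_ENTRY)
        print(f"{gib:>2} GiB of cache -> {rate:.0%} expected hit rate")
```

Under uniform random access the hit rate grows linearly with memory until the whole working set fits, which is why extra MDS RAM pays off so directly in this workload. Real access patterns are usually skewed toward a hot subset, so actual hit rates would tend to be higher than this estimate for the same memory.
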
Hi Donny--

Donny Cooper wrote:
>
> MDS Server Considerations:
> (1) Any recommended ratio of MDS memory to the number of OSTs.
> Does it make sense to assume, the greater number of OSTs, the more "muscle"
> your MDS should have?

The amount of MDS memory is not really related to the number of OSTs,
no. There are two factors which I would consider in your decision:

- the number of clients
- the size of your directories, and variety of the load

If you have many hundreds or thousands of clients, the default
configuration will allow hundreds of thousands of locks to be acquired
on the scale of the entire cluster. That can be tuned down, but it's
important to remember that clients can only cache things when they have
a lock for them -- if they can hold very few locks, they can cache very
little metadata. So it's worth spending a few hundred extra dollars for
a couple of gigabytes of RAM.

If you have large directories (1 million or more files) or a large
number of directories which see a lot of use, you may benefit
significantly from having more memory. In an environment where clients
randomly access one of 10 million files, for example, having extra
memory for the cache improves performance very significantly.

> OST Server configurations:
> (1) Any recommended RAM size for the OSTs? Is more better? ...or is there a
> point of diminishing return, depending on OST CPU speed and interconnect
> bandwidth?

Today the OSTs use memory to manage locks (as in the MDS) and for the
read cache. We do not write through the page cache, so memory size
should have no effect on write performance.

My opinion is that memory on the OST is useful primarily for locks,
unless your load will benefit significantly from read caching.

> (2) Any recommended CPU speed for the OST, based on the interconnect type?
>
> (3) Could a dual-processor server be logically used as 2 OST stripes, if 2
> interconnect paths are fed into the system, and the output bandwidth is 2x of
> the input? (Like 2x Gig-E in ==> 1x 2 Gbps Fibre out) Should it scale
> "similar" to 2 uni-processor systems?

Our experience is that the TCP stack consumes a lot of CPU. We have not
made any serious attempts yet at 2x GigE in/2 Gbit out, but we do have a
rule of thumb:

1/3 CPU for network
1/3 CPU for disk backend
1/3 CPU for Lustre

If you can remain within these constraints, I think it will be ok. If
you want to make a serious attempt for a production site, then I predict
that you will need some support beyond what the mailing list can provide.

It is worth noting that Elan in/2 Gbps out works very well, and goes at
~92% of the raw speed of the disk. I think that the networking issues
will be the hard ones, not at all the Fibre Channel issues.

Hope this helps--

-Phil

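The one-third rule of thumb above can be turned into a quick sanity check against measured CPU usage on an OST node. The following is a minimal sketch, not part of Lustre: the `CpuShares` type, the `over_budget` helper, and the sample percentages are all hypothetical, and the per-component shares would have to come from your own profiling.

```python
from dataclasses import asdict, dataclass


@dataclass
class CpuShares:
    """Fraction of total CPU (0.0 to 1.0) consumed by each component."""
    network: float
    disk: float
    lustre: float


def over_budget(measured: CpuShares, budget: float = 1 / 3) -> list[str]:
    """Return the components whose CPU share exceeds the per-component budget."""
    return [name for name, share in asdict(measured).items() if share > budget]


if __name__ == "__main__":
    # HYPOTHETICAL measurement: a TCP-heavy node where networking dominates.
    sample = CpuShares(network=0.45, disk=0.25, lustre=0.20)
    offenders = over_budget(sample)
    if offenders:
        print("Over the 1/3 budget:", ", ".join(offenders))
    else:
        print("All components within the 1/3 budget")
```

With the sample values shown, only the network share exceeds one third, which matches the observation above that the TCP stack tends to be the expensive part.
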
Thanks for the info, Phil.

I'll post any of my successes (or failures) with the 2x Gig-E paths into a
single 2-way OST, in case it's of any interest to someone else.

Regards,
Donny

On Fri December 19 2003 12:05 am, Phil Schwan wrote:
>Hi Donny--
>
>Donny Cooper wrote:
>>
>> MDS Server Considerations:
>> (1) Any recommended ratio of MDS memory to the number of OSTs.
>> Does it make sense to assume, the greater number of OSTs, the more "muscle"
>> your MDS should have?
>
>The amount of MDS memory is not really related to the number of OSTs,
>no. There are two factors which I would consider in your decision:
>
>- the number of clients
>- the size of your directories, and variety of the load
>
>If you have many hundreds or thousands of clients, the default
>configuration will allow hundreds of thousands of locks to be acquired
>on the scale of the entire cluster. That can be tuned down, but it's
>important to remember that clients can only cache things when they have
>a lock for it -- if they can hold very few locks, they can cache very
>little metadata. So it's worth spending a few hundred extra dollars for
>a couple gigabytes of ram.
>
>If you have large directories (1 million or more files) or a large
>number of directories which see a lot of use, you may benefit
>significantly from having more memory. In an environment where clients
>randomly access one of 10 million files, for example, having extra
>memory for the cache improves performance very significantly.
>
>> OST Server configurations:
>> (1) Any recommended RAM size for the OSTs? Is more better? ...or is there a
>> point of diminishing return, depending on OST CPU speed and interconnect
>> bandwidth?
>
>Today the OSTs use memory to manage locks (as in the MDS) and for the
>read cache. We do not write through the page cache, so memory size
>should have no effect on write performance.
>
>My opinion is that memory on the OST is useful primarily for locks,
>unless your load will benefit significantly from read caching.
>
>> (2) Any recommended CPU speed for the OST, based on the interconnect type?
>>
>> (3) Could a dual-processor server be logically used as 2 OST stripes, if 2
>> interconnect paths are fed into the system, and the output bandwidth is 2x of
>> the input? (Like 2x Gig-E in ==> 1x 2 Gbps Fibre out) Should it scale
>> "similar" to 2 uni-processor systems?
>
>Our experience is that the TCP stack consumes a lot of CPU. We have not
>made any serious attempts yet at 2xGige in/2 Gbit out, but we do have a
>rule of thumb:
>
>1/3 CPU for network
>1/3 CPU for disk backend
>1/3 CPU for Lustre
>
>If you can remain within these constraints, I think it will be ok. If
>you want to make a serious attempt for a production site, then I predict
>that you will need some support beyond what the mailing list can provide.
>
>It is worth noting that Elan in/2 Gbps out works very well, and goes at
>~92% of the raw speed of the disk. I think that the networking issues
>will be the hard ones, not at all the fibrechannel issues.
>
>Hope this helps--
>
>-Phil
>

--
Donny Cooper
NEC Solutions (America), Inc.
Advanced Technical Computing Center
ph: +1-281-465-1506
email: Donny.Cooper@NECsam.com

Hello,

Are there any "rules-of-thumb" for hardware considerations in the MDS and OST
servers, such as the topics below? Sorry to bother you if this is covered in
some Lustre document, but I could not find anything in the PDFs from the
website and it was not covered in a recent tutorial I attended.

MDS Server Considerations:
(1) Any recommended ratio of MDS memory to the number of OSTs.
Does it make sense to assume, the greater number of OSTs, the more "muscle"
your MDS should have?

OST Server configurations:
(1) Any recommended RAM size for the OSTs? Is more better? ...or is there a
point of diminishing return, depending on OST CPU speed and interconnect
bandwidth?

(2) Any recommended CPU speed for the OST, based on the interconnect type?

(3) Could a dual-processor server be logically used as 2 OST stripes, if 2
interconnect paths are fed into the system, and the output bandwidth is 2x of
the input? (Like 2x Gig-E in ==> 1x 2 Gbps Fibre out) Should it scale
"similar" to 2 uni-processor systems?

Not looking for any specific numbers or a hard and fast rule, just some
conceptual ideas. Any input is appreciated.

Thanks,
Donny

--
Donny Cooper
NEC Solutions (America), Inc.
Advanced Technical Computing Center
ph: +1-281-465-1506
email: Donny.Cooper@NECsam.com