Hi all,

Attached is my first attempt at characterizing the performance of a DDN S2A 9500 under various loads using sgpdd-survey.

I tested IO sizes from 512K to 4M with reads, writes with writeback cache enabled, and writes with writeback cache disabled. As can be seen from the graphs, performance really does improve with 4M IOs, especially for writes without WB. However, even with 4M, reads and writes without WB never reach the nice "plateau" seen with the S2A 8500 and 1M IOs.

I will include DDN settings (there's even a tab set aside for them) but I'm not sure what's useful - can anyone suggest what would be good to include? I have 'showall' from the controller, but that is far too voluminous (and also contains sensitive information).

Also, if anyone can suggest tunings that are likely to improve either read or write performance, please let me know and I will try them if possible. We have done the standard tuning shown at:
https://mail.clusterfs.com/wikis/lustre/LustreDdnTuning

I plan on performing an obdfilter-survey on the same hardware, and am in the process of doing a similar study on Thumper hardware as well.

Cheers,
Jody

[Attachment: 9500-sgp_dd.xls, application/vnd.ms-excel, 129536 bytes - http://mail.clusterfs.com/pipermail/lustre-devel/attachments/20061208/007d8945/9500-sgp_dd-0001.xls]
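For readers unfamiliar with the tool: sgpdd-survey is a shell script that repeatedly invokes sgp_dd (from sg3_utils) over a grid of I/O sizes, region counts, and thread counts. Below is a minimal Python sketch of the I/O-size sweep described above; the device path, data size, and thread count are placeholder assumptions, not the configuration behind the attached results.

import subprocess

DEV = "/dev/sg0"      # placeholder sg device for one LUN
BS = 512              # logical block size the LUN reports (assumed)
SIZE = 1 << 30        # bytes to move per data point (1 GB)

# Sweep I/O sizes from 512K to 4M, timing concurrent reads at each.
for xfer in (512 << 10, 1 << 20, 2 << 20, 4 << 20):
    bpt = xfer // BS  # sgp_dd "blocks per transfer" sets the I/O size
    subprocess.run(
        ["sgp_dd", f"if={DEV}", "of=/dev/null",
         f"bs={BS}", f"bpt={bpt}", f"count={SIZE // BS}",
         "thr=8", "time=1"],  # 8 threads; time=1 reports the rate
        check=True)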
Jody -

In OpenOffice the graphs don't show well. I agree with your interpretation; there are worries here. I wouldn't start with the obd survey until we have learned how to tune sgp further.

For the runs that Eric Barton did, the critical parameters were the MF (a read-ahead factor) and the readahead setting. What did you set those to?

I have one question about the IO kit - could it write to many regions (say 10,000) with fewer threads? This is a realistic situation under Lustre load on larger clusters, I think.

Finally, I have forwarded the graphs to DDN - I hope they can comment.

- Peter -
Jody,

I can't see the DDN settings - the ddn setup worksheet says "TODO".

> I have one question about the IO kit - could it write to many
> regions (say 10,000) with fewer threads? This is a realistic
> situation under Lustre load on larger clusters I think.

sgpdd-survey isn't coded to handle #regions > #threads. I think this would need a custom program, and if we write one, it should run the whole survey.

> Finally, I have forwarded the graphs to DDN - I hope they can
> comment.

I already tried to get DDN to grok that ever-increasing I/O size isn't a panacea. This is useful evidence.

It would be really useful for a DDN person to crawl over a listing of all the settings that these measurements were taken with.

Cheers,
Eric

---------------------------------------------------
|Eric Barton        Barton Software               |
|9 York Gardens     Tel: +44 (117) 330 1575       |
|Clifton            Mobile: +44 (7909) 680 356    |
|Bristol BS8 4LL    Fax: call first               |
|United Kingdom     E-Mail: eeb@bartonsoftware.com|
---------------------------------------------------
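To make the many-regions case concrete, here is a rough sketch of the kind of custom program Eric describes: a small fixed pool of threads driving writes spread over far more regions than there are threads. The device path, region size, and counts are illustrative assumptions, and it uses plain buffered writes where a real benchmark would use O_DIRECT with aligned buffers.

import os
import threading

DEV = "/dev/sdb"               # placeholder block device
REGIONS = 10000                # many regions...
THREADS = 16                   # ...served by far fewer threads
REGION_SZ = 64 * 1024 * 1024   # assumed region size (64 MB)
IO_SZ = 4 * 1024 * 1024        # 4 MB writes
buf = b"\0" * IO_SZ

def worker(tid):
    # Each thread strides over its share of the regions, so the device
    # sees I/O scattered across thousands of distinct offsets even
    # though only a handful of threads are issuing it.
    fd = os.open(DEV, os.O_WRONLY)
    try:
        for region in range(tid, REGIONS, THREADS):
            os.pwrite(fd, buf, region * REGION_SZ)
    finally:
        os.close(fd)

workers = [threading.Thread(target=worker, args=(t,)) for t in range(THREADS)]
for w in workers:
    w.start()
for w in workers:
    w.join()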
Hi Eric,

On Sat, Dec 09, 2006 at 03:56:57PM -0000, Eric Barton wrote:

> I can't see the DDN settings - the ddn setup worksheet says "TODO".

Yes, they are TODO, mainly because I have 800K of settings and they are unlikely to all be useful. I'll add at least MF and readahead as mentioned by Braam, and please let me know if you have any other requests.

> It would be really useful for a DDN person to crawl over a listing
> of all the settings that these measurements were taken with.

I can almost certainly send all the settings to DDN confidentially.

Cheers,
Jody
Jody -

That looks extremely ominous with the prefetchCeiling - if that is like Barton's readahead parameter, we are re-creating old problems. Perhaps we can set this to 0?

- Peter -

> I can't see any setting that looks exactly like what you described, but
> I found the following:
>
> lun_maxPrefetch x 1
> lun_mfbit On
> lun_prefetchCeiling 65535
>
> Cheers,
> Jody
On Mon, Dec 11, 2006 at 10:10:01AM -0700, Peter J Braam wrote:

> That looks extremely ominous with the prefetchCeiling - if that is
> like Barton's readahead parameter, we are re-creating old problems.
> Perhaps we can set this to 0?

We will rerun the read test with maxPrefetch and prefetchCeiling set to 0.

Cheers,
Jody
Jody,

Can you check whether sgpdd-survey will preserve and/or control I/O alignment? It's been brought to my attention that more recent DDNs are sensitive to alignment, which didn't seem to be an issue when I first wrote that script. The first thing to do would be to check the DDN stats offset.

Cheers,
Eric
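A cheap host-side cross-check before going to the controller stats is to verify that every offset the survey issues lands on a full-stripe boundary. In the sketch below the 4 MB stripe width is an assumption (8 data disks times an assumed 512 KB segment), not a value taken from either test system.

STRIPE = 8 * 512 * 1024  # assumed full stripe: 8 data disks x 512 KB

def misaligned(offsets):
    """Return the byte offsets that do not start on a stripe boundary."""
    return [o for o in offsets if o % STRIPE]

# Example: a region starting at 6 MB is not full-stripe aligned.
print(misaligned([0, 4 << 20, 6 << 20, 8 << 20]))  # -> [6291456]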
Hi all,

Here is a completely new set of sgpdd-survey results from a DDN S2A 9550. After much investigation, it seems there were several problems with the surveys I posted earlier, mainly a minor hardware issue (which DDN spotted and fixed almost immediately) and a misconfiguration of sgpdd-survey itself (which we will address in a future release of the tool). I also no longer have access to the hardware used for the first set of surveys, so I discarded all those results and started again.

You can see that with the right settings (4 MB IOs, no prefetch, maxcmds=16, and sgpdd-survey using the correct block size of 4096), read performance now "plateaus" nicely, as with an S2A 8500. However, it does so at a relatively low number for region counts above 16. I expect we can improve these numbers even further and will work with DDN on this, so don't take these graphs as "final" in any way.

Thanks to Oak Ridge National Laboratory for access to the system used to perform the first set of surveys, to Indiana University for access to the system used for this set, and to DDN for their help so far in tuning the systems.

Cheers,
Jody

[Attachment: 9500-sgp_dd.xls, application/vnd.ms-excel, 375808 bytes - http://mail.clusterfs.com/pipermail/lustre-devel/attachments/20061229/eb598846/9500-sgp_dd-0001.xls]
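Since one of the fixes was making sgpdd-survey use the device's real 4096-byte block size, it is worth noting that a mismatch like this can be caught by querying the device instead of assuming 512 bytes. Below is a sketch using the standard Linux BLKSSZGET ioctl; the device path is a placeholder. With bs=4096, the 4 MB transfers used here correspond to 1024 blocks per transfer.

import fcntl
import os
import struct

BLKSSZGET = 0x1268  # from <linux/fs.h>

fd = os.open("/dev/sdb", os.O_RDONLY)  # placeholder device path
try:
    # The ioctl fills in the device's logical sector size.
    raw = fcntl.ioctl(fd, BLKSSZGET, struct.pack("i", 0))
    print("logical block size:", struct.unpack("i", raw)[0])
finally:
    os.close(fd)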
Peter Kjellstrom
2006-Dec-29 11:43 UTC
[Lustre-devel] sgpdd-survey of DDN S2A 9500, take 2
On Friday 29 December 2006 18:49, Jody McIntyre wrote:

> Hi all,
>
> Here is a completely new set of sgpdd-survey results from a DDN S2A
> 9550.

Any chance that those graphs could be made available in some way for non-excel people? (They don't render right in OpenOffice.) Maybe exported to bitmap or reduced in complexity?

Curious to see what they look like,
Peter K
Brian J. Murrell
2006-Dec-29 11:58 UTC
[Lustre-devel] sgpdd-survey of DDN S2A 9500, take 2
On Fri, 2006-12-29 at 19:44 +0100, Peter Kjellstrom wrote:

> Any chance that those graphs could be made available in some way for
> non-excel people?

They rendered in Gnumeric pretty well.

b.
Peter Kjellstrom
2006-Dec-29 12:05 UTC
[Lustre-devel] sgpdd-survey of DDN S2A 9500, take 2
On Friday 29 December 2006 19:58, Brian J. Murrell wrote:

> > Any chance that those graphs could be made available in some way for
> > non-excel people?
>
> They rendered in Gnumeric pretty well.

Indeed they do :-) I usually use Gnumeric, but I didn't expect it to beat OO-2 on excel imports... I'm happy; ignore my previous requests.

/Peter
Hi Jody,

Please provide more details regarding the HW setup (e.g. disk drive model and number of drives used, what kind of storage interface was used, RAID configuration). I could not find it in the file.

Thanks,
Mirko
Jody,

Is it expected that the prime settings for a single port will be relevant when all ports are in use? Is there any way to run this suite across all the 9500 interfaces simultaneously?

paul
Hi Mirko,

On Tue, Jan 02, 2007 at 08:42:34AM +0100, Mirko Benz wrote:

> Please provide more details regarding the HW setup (e.g. disk drive
> model and number of drives used, what kind of storage interface was
> used, RAID configuration). I could not find it in the file.

The disks are 10x ST350064 in an 8+2 configuration. The storage interface was 4 Gbit FC. No special RAID configuration except a block size of 4096 (which we recommend for all DDN+Lustre installations).

Cheers,
Jody
Hi Paul,

On Tue, Jan 02, 2007 at 07:43:57AM -0500, Pauln wrote:

> Is it expected that the prime settings for a single port will be
> relevant when all ports are in use?

Yes.

> Is there any way to run this suite
> across all the 9500 interfaces simultaneously?

Yes, there is, but given the above, it's unnecessary. If you've looked at the Sun Thumper graphs, you will see I surveyed all MD devices at once for most of those surveys, because it does make a difference with that hardware.

Cheers,
Jody
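Mechanically, surveying every interface at once is straightforward: launch the same sgp_dd workload against one sg device per controller port in parallel and wait for all of them, as in the sketch below. The device list and parameters are placeholder assumptions.

import subprocess

DEVS = ["/dev/sg0", "/dev/sg1", "/dev/sg2", "/dev/sg3"]  # assumed one per port

# Start one sgp_dd per device so all ports are driven simultaneously;
# with time=1 each prints its own elapsed time and rate on completion.
procs = [
    subprocess.Popen(
        ["sgp_dd", f"if={dev}", "of=/dev/null",
         "bs=4096", "bpt=1024",               # 4 MB I/Os on 4096-byte sectors
         "count=262144", "thr=8", "time=1"])  # 1 GB per device
    for dev in DEVS
]
for p in procs:
    p.wait()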
Any chance to test with more drives (e.g. 40, to match the Sun Thumper setup) and over all interfaces?

Regards,
Mirko
Hi Mirko,

On Wed, Jan 03, 2007 at 10:37:34AM +0100, Mirko Benz wrote:

> Any chance to test with more drives (e.g. 40 to match the Sun Thumper
> setup) and over all interfaces?

I don't have any present plans to do this. The intention of these surveys is _not_ to compare Sun and DDN hardware, but rather to determine how to get the best performance out of each.

Cheers,
Jody