I have been looking into Lustre as an alternative to our shared file system on a cluster of about 20+ nodes.

The configuration I am using for Lustre is a system that has a second, unused SATA drive that I have partitioned into:

1 MDS + MGS partition (4 GB)
1 OST partition (76 GB)

I am sharing this out to the cluster on its own network interface, which I have specified in modprobe.conf.

As the data we are manipulating ranges between 150 MB and 1.1 GB, I have been timing how long it takes to read and write a 1.1 GB file. Basically:

    time dd if=/dev/zero of=<lustre_fs>/file bs=16k count=65536

and then:

    time dd if=<lustre_fs>/file of=/dev/null

Now, to be honest, I am not seeing any difference between Lustre and NFS.

Testing on five nodes:

                                        Lustre                NFS
    Node  File size  Method   Count    Time    Speed        Time    Speed
    1     1.1 GB     Writing  65536    1m48s   9.9 MB/s     1m45s   10.2 MB/s
    2     1.1 GB     Writing  65536    1m52s   9.5 MB/s     1m42s   10.5 MB/s
    3     1.1 GB     Writing  65536    1m53s   9.4 MB/s     1m44s   10.3 MB/s
    4     1.1 GB     Writing  65536    1m54s   9.4 MB/s     1m44s   10.2 MB/s
    5     1.1 GB     Writing  65536    1m55s   9.3 MB/s     1m44s   10.3 MB/s
    1     1.1 GB     Reading           1m43s   10.4 MB/s    1m22s   13.2 MB/s
    2     1.1 GB     Reading           1m44s   10.3 MB/s    1m34s   11.4 MB/s
    3     1.1 GB     Reading           1m40s   10.6 MB/s    1m27s   12.3 MB/s
    4     1.1 GB     Reading           1m33s   11.4 MB/s    1m44s   10.2 MB/s
    5     1.1 GB     Reading           1m39s   10.7 MB/s    1m35s   11.2 MB/s

Do I need to tweak anything, or is this about right? I started off testing just one node, then two, and now five.

Thanks

Iain

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

SCRI, Invergowrie, Dundee, DD2 5DA.
The Scottish Crop Research Institute is a charitable company limited by guarantee.
Registered in Scotland No: SC 29367.
Recognised by the Inland Revenue as a Scottish Charity No: SC 006662.

DISCLAIMER:
This email is from the Scottish Crop Research Institute, but the views expressed by the sender are not necessarily the views of SCRI and its subsidiaries. This email and any files transmitted with it are confidential to the intended recipient at the e-mail address to which it has been addressed. It may not be disclosed or used by any other than that addressee. If you are not the intended recipient you are requested to preserve this confidentiality and you must not use, disclose, copy, print or rely on this e-mail in any way. Please notify postmaster at scri.ac.uk quoting the name of the sender and delete the email from your system.

Although SCRI has taken reasonable precautions to ensure no viruses are present in this email, neither the Institute nor the sender accepts any responsibility for any viruses, and it is your responsibility to scan the email and the attachments (if any).
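One caveat with this kind of test: unless the client page cache is dropped between the write and the read, the read-back may be served largely from client memory rather than from the OST. A minimal sketch of the timing runs described above, assuming a client mount at /mnt/lustre (the path and file name are placeholders) and root access on the client:

    # Minimal benchmark sketch; /mnt/lustre is an assumed client mount point
    FILE=/mnt/lustre/ddtest.$(hostname)

    # Write 1 GiB in 16 KiB blocks
    time dd if=/dev/zero of=$FILE bs=16k count=65536

    # Flush and drop the client page cache so the read-back really travels
    # over the network instead of being served from local memory
    # (requires kernel 2.6.16 or later)
    sync
    echo 3 > /proc/sys/vm/drop_caches

    # Read it back
    time dd if=$FILE of=/dev/null bs=16k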
Personally I never use the time to write a file as much of an indication of performance, because you never know what was going on between the endpoints of the test. Rather, I run collectl (http://collectl.sourceforge.net/) on one or more OSS nodes and tell it I want to see Lustre data along with CPU and the network. Here's an example of what collectl shows on a GigE network at the default rate of once a second; there is currently no Lustre I/O:

    [root at ibsfs3 ~]# collectl -scln
    waiting for 1 second sample...
    #<-------CPU--------><-----------Network----------><--------Lustre OST------->
    #cpu sys inter  ctxsw netKBi pkt-in netKBo pkt-out KBRead Reads KBWrit Writes
      10   3  1452   1173     30    260     25     304      0     0      0      0
       6   3  1092    253      3     40      2      29      0     0      0      0
       0   0  1059     63      4     61      2      30      0     0      0      0
       1   0  1016     52      0      5      0       3      0     0      0      0

In any event, this should at least allow you to verify that the OSSs are generating data at the expected rates. You don't really need to look at both the network and the Lustre data, since they should be about the same, but I always figure it can't hurt to include both, and if they are ever not the same, something very odd is going on.

Next you can install collectl on a client and issue the same command. Again you should see the expected load, which for striped files will be the load observed above times the number of OSTs. Also the network and Lustre rates should be about the same.

-mark

Iain Grant wrote:

> Now to be honest I am not seeing any difference in Lustre compared with NFS
>
> Do I need to tweak anything or is this right ?
> I started off just testing one node, then 2 and now 5.
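As a concrete sketch of the workflow Mark describes, assuming collectl is installed on the OSS and the file system is mounted at /mnt/lustre on the client (both the install and the path are assumptions):

    # Terminal 1, on the OSS: watch CPU, network and Lustre OST traffic once a
    # second (the same command Mark shows above)
    collectl -scln

    # Terminal 2, on a client: generate some load; the OSS window should show
    # netKBi and KBWrit climbing towards the disk or wire limit, whichever is lower
    time dd if=/dev/zero of=/mnt/lustre/ddtest bs=16k count=65536

Running the same collectl command on the client during the test should show roughly the same rates from the client side.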
I wouldn't expect much performance difference between NFS and Lustre if you are only using one OST and you only have one SATA drive on the back end. We did not find much performance improvement until we had all 64 nodes hammering Lustre. Our NFS server was constantly at a load of 8 or higher with one disk, whereas with Lustre the load average was negligible with four OSTs and four drives. I would look at adding more OSTs and trying the test again; that is where Lustre really shines over NFS.

Robert

On 10/29/07 6:17 AM, "Iain Grant" <Iain.Grant at scri.ac.uk> wrote:

> Now to be honest I am not seeing any difference in Lustre compared with NFS
>
> Do I need to tweak anything or is this right ?
> I started off just testing one node, then 2 and now 5.

Robert LeBlanc
College of Life Sciences Computer Support
Brigham Young University
leblanc at byu.edu
(801) 422-1882
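A rough sketch of the many-clients-at-once test Robert is describing, assuming passwordless ssh to the compute nodes and a Lustre mount at /mnt/lustre on each; the node names are placeholders:

    # Start the same 1 GiB write on every node at once and time the whole batch;
    # aggregate throughput across clients is what Lustre is designed to scale,
    # not a single stream from one node
    time (
        for node in node01 node02 node03 node04 node05; do
            ssh "$node" 'dd if=/dev/zero of=/mnt/lustre/ddtest.$(hostname) bs=16k count=65536' &
        done
        wait
    )

Watching collectl on the OSS at the same time shows whether the single disk or the network interface becomes the bottleneck.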
On Mon, 2007-10-29 at 12:17 +0000, Iain Grant wrote:

> Now to be honest I am not seeing any difference in Lustre compared
> with NFS

You won't. Lustre's shining point is not that it's faster than NFS given a single server and single disk, but rather that it scales incredibly well.

Try adding more disks and, when you max out the bandwidth of that single machine's disk or network (whichever comes first), add a second (and third and fourth, etc.) OSS. Then try some benchmarks.

When you have maxed out the network bandwidth between your client and the Lustre servers, add a second and third, etc. client and try a collective benchmark across all of the clients.

This is where Lustre shines.

b.
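For reference, adding capacity along the lines Brian suggests is mostly a formatting-and-mounting exercise. The sketch below assumes a Lustre 1.6-style setup; the device name, file system name and MGS NID are placeholders, and option details vary between Lustre versions:

    # On the (new or existing) OSS: format a spare disk as an additional OST
    # and mount it; it registers with the MGS and clients start using it
    mkfs.lustre --fsname=lustre --ost --mgsnode=192.168.1.10@tcp0 /dev/sdc1
    mkdir -p /mnt/ost1
    mount -t lustre /dev/sdc1 /mnt/ost1

    # On a client: confirm the new OST shows up and contributes space
    lfs df -h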
I have to agree with Brian, the scalability factor is where Lustre really shines. One more thing to add would be to try different stripe sizes. Each application has its own optimal stripe size, so experiment with different stripes.

On our 19 dual channel bonded GigE OSTs, we see sustained speeds of 200 GB/s when reading the NCBI databases.

Brian J. Murrell wrote:

> You won't. Lustre's shining point is not that it's faster than NFS
> given a single server and single disk, but rather that it scales
> incredibly well.
>
> This is where Lustre shines.

--
Jeremy Mann
jeremy at biochem.uthscsa.edu

University of Texas Health Science Center
Bioinformatics Core Facility
http://www.bioinformatics.uthscsa.edu
Phone: (210) 567-2672
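A sketch of that kind of striping experiment, using the positional lfs setstripe form found in older Lustre releases (newer releases take -s/-c/-i style options instead); the path, stripe size and stripe count are only examples:

    # Create a test directory whose new files stripe across 4 OSTs
    # with a 4 MB stripe size, starting on any OST (-1)
    mkdir /mnt/lustre/stripe_test
    lfs setstripe /mnt/lustre/stripe_test 4194304 -1 4

    # Confirm the layout that files created in the directory will inherit
    lfs getstripe /mnt/lustre/stripe_test

    # Re-run the dd test against a file in this directory and compare
    time dd if=/dev/zero of=/mnt/lustre/stripe_test/file bs=16k count=65536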
Not to nitpick, but is the 200 GBytes/s number correct? That's pretty good for 19 OSTs.

paul

Jeremy Mann wrote:

> On our 19 dual channel bonded GigE OSTs, we see sustained speeds of
> 200 GB/s when reading the NCBI databases.