Sergius ssi. Siczek - sysGen GmbH
2010-Aug-11 14:36 UTC
[Lustre-discuss] LUSTRE Bad Read Performance
Hello,

I've got a strange problem with Lustre read performance.

The setup:

8-core (100 GB RAM) combined MGS/OSS server with two OST targets, with InfiniBand.
2x RAID6 (11x 1 TB HDD -> 8 TB usable on each RAID) (/dev/sdc and /dev/sdd)
/dev/sdc1 = 1024 GB
/dev/sdd1 = 1024 GB
1x small HDD for journal and MDS (/dev/sdb)
/dev/sdb1 = 10 GB journal
/dev/sdb2 = 10 GB journal
/dev/sdb3 = 50 GB MDS

Procedure:

mkfs.lustre --reformat --fsname lustre --mdt --mgs /dev/sdb3
mkfs.lustre --fsname=lustre --reformat --ost --mgsnode 192.168.0.1@o2ib --mkfsoptions="-E stride=256 -E stripe-width=9 -J device=/dev/sdb1" /dev/sdc1
mkfs.lustre --fsname=lustre --reformat --ost --mgsnode 192.168.0.1@o2ib --mkfsoptions="-E stride=256 -E stripe-width=9 -J device=/dev/sdb2" /dev/sdd1

Client machine: 24-core (50 GB RAM) with InfiniBand.

Mount:
mount -t lustre 192.168.0.1@o2ib:/lustre /mnt/client/

After that I ran some simple throughput tests. Write performance (5 threads) is about 1072 MB/s (which is great), but read performance is very bad, ~510 MB/s. What could be wrong with this setup? Why is write performance so good but read so bad?

Of course I tested the hardware itself: I made a filesystem on the RAIDs (without Lustre) and ran the same tests, ~1.3 GB/s write, ~1.7 GB/s read (800 GB file). The InfiniBand is also okay; I used the same cards and cables for some HPC tests just hours before, and everything was fine. Nevertheless, I also used the server itself as a client (just to rule out the InfiniBand side as a potential failure). Nearly the same results (write good, read bad). It's like there is a wall at ~510 MB/s read.

Any ideas or suggestions? It's really urgent. Thank you a lot!

Kernel: 2.6.27.39+lustre1.8.3+0.credativ.lenny.4
Lustre version: 1.8.3-20100410022943-PRISTINE-2.6.27.39+lustre1.8.3+0.credativ.lenny.4

sysGen GmbH
Sergius Siczek
Am Hallacker 48 · 28327 Bremen
Tel (0421) 4 09 66-32 · Fax (0421) 4 09 66-33
ssiczek@sysgen.de · www.sysgen.de
Managing directors: Gabriele Nikisch, Anke Heinßen · Registered at the district court (Amtsgericht) Walsrode · HRB 121943
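For reference, the "simple throughput tests" above are not shown; a minimal sketch of that kind of 5-thread test with plain dd against the mounted client (the file names, sizes, and thread count below are assumptions, not the original test) could look like this:

# write phase: 5 parallel writers, ~20 GB each (total well above the client's 50 GB RAM)
for i in 1 2 3 4 5; do
    dd if=/dev/zero of=/mnt/client/test.$i bs=1M count=20000 &
done
wait
# drop the client page cache so the read phase actually hits the OSTs
sync; echo 3 > /proc/sys/vm/drop_caches
# read phase: 5 parallel readers
for i in 1 2 3 4 5; do
    dd if=/mnt/client/test.$i of=/dev/null bs=1M &
done
wait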
My guess would be that your read pattern is very random (so read-ahead isn't helping) or that you need to tune your read-ahead parameters. A simple test would be to use dd to write/read a file (make sure it's in the 10s of GB to minimize cache effects).

From: lustre-discuss-bounces@lists.lustre.org [mailto:lustre-discuss-bounces@lists.lustre.org] On Behalf Of Sergius ssi. Siczek - sysGen GmbH
Sent: Wednesday, August 11, 2010 10:36 AM
To: lustre-discuss@lists.lustre.org
Cc: GmbH; Dieter Nikisch - sysGen GmbH; Rolf@lists.lustre.org
Subject: [Lustre-discuss] LUSTRE Bad Read Performance
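The read-ahead parameters mentioned in the reply above are client-side tunables; on a Lustre 1.8 client, checking and raising them might look roughly like this (the value 256 is only an example, not a recommendation):

# show the current client read-ahead limits
lctl get_param llite.*.max_read_ahead_mb llite.*.max_read_ahead_per_file_mb
# raise them, then re-run the dd read test
lctl set_param llite.*.max_read_ahead_mb=256
lctl set_param llite.*.max_read_ahead_per_file_mb=256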
On Wednesday, August 11, 2010, Sergius ssi. Siczek - sysGen GmbH wrote:
> Hello,
>
> I've got a strange problem with Lustre read performance.

You might want to check Lustre bugs 19146, 21958, 23343 and 23549, and, to sum it up, try to disable checksums first and see if that helps.

Cheers,
Bernd

--
Bernd Schubert
DataDirect Networks
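For reference, the checksums Bernd refers to can be disabled at runtime on a Lustre 1.8 client via the OSC tunables, roughly like this:

# show whether wire checksums are currently enabled (1 = on)
lctl get_param osc.*.checksums
# disable them on this client, then repeat the read test
lctl set_param osc.*.checksums=0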