James Robnett
2009-Oct-29 23:00 UTC
[Lustre-discuss] Sgpdd-survey (sgp_dd) memory allocation error
I'm having a problem with sgpdd-survey on a raid array returning:

  Thu Oct 29 11:19:29 MDT 2009 sgpdd-survey on /dev/sdb from lustre-oss-3
  total_size 8388608K rsz 1024 crg 1 thr 1 write 1 failed read 1 failed

for all tests. In addition the details file shows:

  ==============> total_size 8388608K rsz 1024 crg 1 thr 1
  =====> write
  sg starting out command at "sgp_dd.c":872: Cannot allocate memory

An example sgp_dd call that the survey is making is:

  sgp_dd if=/dev/zero of=/dev/sg1 seek=1024 thr=1 count=16777216 bs=512 bpt=2048 time=1
  sg starting out command at "sgp_dd.c":872: Cannot allocate memory

The same command using sg_dd (sans the thr arg) instead of sgp_dd works:

  sg_dd if=/dev/zero of=/dev/sg1 seek=1024 count=16777216 bs=512 bpt=2048 time=1
  Reducing write to 256 blocks per loop
  time to transfer data: 31.908034 secs at 269.24 MB/sec
  16779008+0 records in
  16777216+0 records out

sgp_dd with a thread count of 1 and block/transaction size of 2048 won't work with a count greater than 256 on this system.

  sgp_dd if=/dev/zero of=/dev/sg1 seek=1024 thr=1 count=257 bs=512 bpt=1024 time=1
  sg starting out command at "sgp_dd.c":872: Cannot allocate memory

  sgp_dd if=/dev/zero of=/dev/sg1 seek=1024 thr=1 count=256 bs=512 bpt=1024 time=1
  time to transfer data was 0.000660 secs, 198.59 MB/sec
  256+0 records in
  256+0 records out

The same command run on a different machine, against a single 1TB disk and running the exact same OS/kernel but with 8GB of memory, works just fine, with no equivalent limits.

Further, on the machine in question sg_dd first responds with "Reducing write to 256 blocks per copy" right off the bat. That is the same threshold at which sgp_dd either works or fails. The different machine does not generate that message with sg_dd when using a large count size.

Any insight as to what causes the memory allocation problem with sgp_dd but not with sg_dd (not a ulimit issue) on this hardware but not on others, or why sg_dd can deal with that difference but sgp_dd can't?
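
For reference, the sizes implied by the parameters above can be worked out with plain shell arithmetic. This is only a sketch of the arithmetic: the variable names are illustrative and not part of sgp_dd or sgpdd-survey, and it does not explain why the allocation fails.

```shell
#!/bin/sh
# Values copied from the sgp_dd / sg_dd command lines quoted above.
bs=512          # bytes per block (bs=)
bpt=2048        # blocks per transfer requested by the survey (bpt=)
count=16777216  # total number of blocks to copy (count=)
reduced=256     # blocks per transfer after sg_dd's "Reducing write to 256 blocks"

# Hypothetical helper variables, used only for this illustration.
transfer_bytes=$((bs * bpt))       # size of one requested transfer in bytes
total_kib=$((bs * count / 1024))   # total amount copied, in KiB
reduced_bytes=$((bs * reduced))    # size of one reduced transfer in bytes

echo "requested transfer: $((transfer_bytes / 1024)) KiB"
echo "total size: ${total_kib}K"
echo "reduced transfer: $((reduced_bytes / 1024)) KiB"
```

The requested transfer comes to 1024 KiB, matching "rsz 1024" in the survey output, and the total matches "total_size 8388608K". The reduced transfer that sg_dd falls back to is 128 KiB, one eighth of what was requested.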
For what it's worth this OST was part of a 4 OSS filesystem that works just fine. I'm just using it and the others to test some application software and wanted to revisit some of the benchmarks. It's not a critical issue by any means, and this particular benchmark isn't useful in this case; I'm just terminally curious.

James Robnett
NRAO/NM

Iokit version 1.2
RHEL 5.3 w/ 2.6.18-128.7.1.el5 (and a lustre kernel of same version).
sg3_utils-1.25-1.el5
4GB Memory
3ware 9550SX 8-port raid controller
SATA 400GB WD disks.