Dear list,

After upgrading our OSSs to lustre-1.6.6-2.6.9_67.0.22.EL_lustre.1.6.6smp.x86_64, the frequent crashes have disappeared. However, the two OSSs deliver low performance: 100MB/s read and 200MB/s write. I/O wait on the client nodes sometimes exceeds 90%. The OSS servers are attached via 10Gb/s Ethernet and two 4Gb Express channel disk arrays.

Is this a problem caused by 32bit clients and 64bit OSS servers?

Thanks.

--------------
Lu Wang
Computing Center
Institute of High Energy Physics, China
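For scale, the reported figures can be set against the raw line rate of the network link. The sketch below is only back-of-the-envelope arithmetic: it assumes decimal units, ignores all protocol and encoding overhead, and the helper names are made up for illustration. The post does not say whether 100MB/s and 200MB/s are per-OSS or aggregate figures, so the percentages are indicative only.

```python
# Back-of-the-envelope line-rate arithmetic (illustrative helpers, not a Lustre tool).
# Assumption: decimal units (1 Gb = 1000 Mb), 8 bits per byte, no protocol overhead.

def gbit_to_mbyte_per_s(gbit_per_s: float) -> float:
    """Convert a raw line rate in Gb/s to MB/s."""
    return gbit_per_s * 1000 / 8


def fraction_of_link(observed_mb_s: float, link_gbit_per_s: float) -> float:
    """Return the observed throughput as a fraction of the raw line rate."""
    return observed_mb_s / gbit_to_mbyte_per_s(link_gbit_per_s)


if __name__ == "__main__":
    link = 10  # Gb/s Ethernet, as stated in the post
    print(f"10 Gb/s link: {gbit_to_mbyte_per_s(link):.0f} MB/s raw")
    for label, observed in (("read", 100), ("write", 200)):
        share = fraction_of_link(observed, link)
        print(f"{label}: {observed} MB/s is {share:.0%} of the raw line rate")
```

On these assumptions a 10Gb/s link carries about 1250 MB/s raw, so 100MB/s is roughly 8% of it and 200MB/s roughly 16%.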
On Thu, 2008-12-04 at 18:36 +0800, Lu Wang wrote:
> Dear list,
>
> After upgrade our OSSs to lustre-1.6.6-2.6.9_67.0.22.EL_lustre.1.6.6smp.x86_64, the phenomenon of frequent crash disappears.

Good. What version did you upgrade from?

> However, the two OSS provide low performance: 100MB/s read, and 200MB/s write.

What was your read/write performance before the upgrade? Were these 100 and 200 MB/s figures measured at the clients, or are they a measurement of the disk speed at the OSS?

> Is it a problem caused by 32bit client and 64bit OSS server?

No. Mixed 32/64 bit clusters should not cause a slow-down.

b.
1) We upgraded from 1.6.5.1 (32bit).
2) Yes, it is the disk speed of the OSSs.

By the way, our MDS is still on 1.6.5.1 (32bit). Is this a problem?

"Note - Lustre clients running on architectures with different endianness are supported. One limitation is that the PAGE_SIZE kernel macro on the client must be as large as the PAGE_SIZE of the server. In particular, ia64 clients with large pages (up to 64kB pages) can run with i386 servers (4kB pages). If you are running i386 clients with ia64 servers, you must compile the ia64 kernel with a 4kB PAGE_SIZE (so the server page size is not larger than the client page size)."

When I strace a "cp" from Lustre to local disk, it reads and writes 4K at a time:

    .....
    read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
    write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
    .....

When I strace a "cp" from Lustre to Lustre, it works in 2M (2097152-byte) blocks:

    read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2097152) = 2097152
    _llseek(4, 2097152, [2097152], SEEK_CUR) = 0
    read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2097152) = 2097152
    _llseek(4, 2097152, [4194304], SEEK_CUR) = 0

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss at lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
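To check whether the request size seen in the strace output matters for throughput, one could time sequential reads of the same file with 4096-byte and 2097152-byte blocks. The following is a minimal sketch, not a Lustre benchmark: the function name and the command-line usage are illustrative, and results on a real client will be distorted by the page cache unless the file is larger than memory or the cache is dropped between runs.

```python
import sys
import time


def read_in_blocks(path: str, block_size: int):
    """Read a file sequentially in block_size chunks.

    Returns (total_bytes, non_empty_read_calls, elapsed_seconds).
    """
    total = 0
    calls = 0
    start = time.perf_counter()
    # buffering=0 gives an unbuffered file object, so each read() below
    # issues one read(2) system call asking for block_size bytes.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
            calls += 1
    return total, calls, time.perf_counter() - start


if __name__ == "__main__" and len(sys.argv) > 1:
    # Block sizes taken from the two strace excerpts: 4096 and 2097152 bytes.
    for size in (4096, 2097152):
        total, calls, secs = read_in_blocks(sys.argv[1], size)
        rate = total / 1e6 / secs if secs > 0 else float("inf")
        print(f"block={size:>8}  reads={calls:>8}  {rate:10.1f} MB/s")
```

Run it with the path of a large file on the Lustre mount as the only argument and compare the two reported rates.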
This is the network traffic graph of one OSS (the other one is similar, since files are striped). There are some peaks, and I am sure the user jobs that generated the peaks had not finished when the performance dropped.

------------------
Lu Wang
2008-12-05

-------------- next part --------------
A non-text attachment was scrubbed...
Name: network traffic.JPG
Type: image/jpeg
Size: 101123 bytes
Desc: not available
Url : http://lists.lustre.org/pipermail/lustre-discuss/attachments/20081205/04d06b3a/attachment-0001.jpe