Murray Smigel
2008-Jun-02 17:40 UTC
[Lustre-discuss] performance issues in simple lustre setup
Hi,

I have built a simple Lustre setup. The MDS and OSS are both on a CentOS 5 machine using the Red Hat Lustre-patched kernel 2.6.18-8.1.14.el5_lustre.1.6.4.1 running OFED-1.3. Lustre is 1.6.4.3.

The client machine is Debian running the same kernel, OFED-1.3, and Lustre 1.6.4.3.

The MDT and OST are both single partitions on the same disk (yes, I know this is not optimal...). The network uses Mellanox ConnectX HCAs running through a Voltaire ISR2004 switch.

The basic RDMA setup seems to work in either direction:

murray@nasnu3:/slut$ ib_rdma_bw 192.168.3.50    (to the Lustre server)
5605: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | iters=1000 | duplex=0 | cma=0 |
5605: Local address:  LID 0x07, QPN 0x22004e, PSN 0x323aa6 RKey 0x1a002800 VAddr 0x002aaaaaad6000
5605: Remote address: LID 0x05, QPN 0x8004f,  PSN 0x67c28c, RKey 0x8002800 VAddr 0x002aaaab705000
5605: Bandwidth peak (#0 to #985): 1332.53 MB/sec
5605: Bandwidth average: 1332.47 MB/sec
5605: Service Demand peak (#0 to #985): 1462 cycles/KB
5605: Service Demand Avg  : 1462 cycles/KB

[murray@lusty bin]$ ib_rdma_bw 192.168.3.30    (to the Lustre client)
3845: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | iters=1000 | duplex=0 | cma=0 |
3845: Local address:  LID 0x05, QPN 0xa004f,  PSN 0x4f4712 RKey 0xa002800 VAddr 0x002aaaab705000
3845: Remote address: LID 0x07, QPN 0x24004e, PSN 0xa740c1, RKey 0x1c002800 VAddr 0x002aaaaaad6000
3845: Bandwidth peak (#0 to #956): 1533.5 MB/sec
3845: Bandwidth average: 1533.43 MB/sec
3845: Service Demand peak (#0 to #956): 1146 cycles/KB
3845: Service Demand Avg  : 1146 cycles/KB

Local disk speed on the Lustre server seems fine, as does the speed when the server writes to its own Lustre-mounted drive (50-80 MB/s):

[murray@lusty slut]$ dd if=/dev/zero of=foo bs=1048576 count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 13.5875 seconds, 79.0 MB/s

Performance of the client machine writing to the Lustre drive is poor (12 MB/s):

murray@nasnu3:/slut$ mount -t lustre -l
192.168.3.50@o2ib:/lusty on /slut type lustre (rw)

murray@nasnu3:/slut$ dd if=/dev/zero of=foo bs=1048576 count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 88.9857 seconds, 12.1 MB/s

Bonnie++ gives similar results.

Any ideas as to what might be going on?

Thanks,
murray
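[Editor's note: one thing worth ruling out when comparing dd numbers like these is client-side page caching, which can skew buffered writes of /dev/zero. A minimal sketch of a cache-independent rerun (same file and mount point as in the post; flags assume GNU dd):

# force each write to bypass the client page cache
dd if=/dev/zero of=/slut/foo bs=1048576 count=1024 oflag=direct

# or keep buffered writes, but include the final flush to the servers in the timing
dd if=/dev/zero of=/slut/foo bs=1048576 count=1024 conv=fdatasync
]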
Isaac Huang
2008-Jun-03 03:28 UTC
[Lustre-discuss] performance issues in simple lustre setup
On Mon, Jun 02, 2008 at 01:40:20PM -0400, Murray Smigel wrote:
> Hi,
> I have built a simple Lustre setup. The MDS and OSS are both on a CentOS 5
> machine using the Red Hat Lustre-patched kernel
> 2.6.18-8.1.14.el5_lustre.1.6.4.1 running OFED-1.3. Lustre is 1.6.4.3.
>
> The client machine is Debian running the same kernel, OFED-1.3, and
> Lustre 1.6.4.3.
>
> The MDT and OST are both single partitions on the same disk (yes,
> I know this is not optimal...). The network uses Mellanox ConnectX HCAs
> running through a Voltaire ISR2004 switch.

What is your Lustre network configuration (i.e., the lnet options)? If you are not sure, what is the output of 'lctl list_nids' on the client and the server?

Isaac
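[Editor's note: a quick way to gather what is being asked for here (paths assume the stock module-option locations on a CentOS 5 server and a Debian client respectively):

# show the NIDs LNET has configured on this node
lctl list_nids

# show the lnet module options that produced them
grep lnet /etc/modprobe.conf          # CentOS 5 server
grep lnet /etc/modprobe.d/lustre      # Debian client
]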
Murray Smigel
2008-Jun-03 11:39 UTC
[Lustre-discuss] performance issues in simple lustre setup
Isaac Huang wrote:
> What is your Lustre network configuration (i.e., the lnet options)? If you
> are not sure, what is the output of 'lctl list_nids' on the client and the
> server?

On the server (I have tried both msi_x=1 and 0 with no difference), /etc/modprobe.conf contains:

options lnet networks="tcp,o2ib"
options mlx4_core msi_x=0
alias ib0 ib_ipoib
alias ib1 ib_ipoib

[root@lusty lustre-iokit-1.2]# lctl list_nids
192.168.1.94@tcp
192.168.3.50@o2ib

On the client, /etc/modprobe.d/lustre contains:

options lnet networks="tcp,o2ib"
options mlx4_core msi_x=1
alias ib0 ib_ipoib
alias ib1 ib_ipoib

nasnu3:/home/murray/lustre-iokit-1.2# lctl list_nids
192.168.1.156@tcp
192.168.3.30@o2ib

Thanks,
murray
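[Editor's note: with both tcp and o2ib configured on each node, it can help to make the interface binding explicit so there is no ambiguity about which device backs which LNET network. A hedged sketch (eth0 and ib0 are assumed interface names for these hosts):

# /etc/modprobe.conf on the server, /etc/modprobe.d/lustre on the client
options lnet networks="o2ib0(ib0),tcp0(eth0)"
]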
Murray Smigel
2008-Jun-03 12:32 UTC
[Lustre-discuss] performance issues in simple lustre setup
Some additional information on the problem. I tried disconnecting the ethernet connection to the server machine (192.168.1.94) and then running a disk test on the client (192.168.1.156 via ethernet), writing to what I thought was the IB-mounted file system (mount -t lustre 192.168.3.50@o2ib:/lusty /slut, where 192.168.3.50 is the IP of ib0 on the server machine).

Looking at /var/log/syslog I saw:

Jun 3 08:18:59 nasnu3 kernel: Lustre: 3625:0:(router.c:167:lnet_notify()) Ignoring prediction from 192.168.1.156@tcp of 192.168.1.94@tcp down 23987983711 seconds in the future
Jun 3 08:19:19 nasnu3 kernel: Lustre: 3623:0:(router.c:167:lnet_notify()) Ignoring prediction from 192.168.1.156@tcp of 192.168.1.94@tcp down 23987983691 seconds in the future
Jun 3 08:19:44 nasnu3 kernel: Lustre: 3622:0:(router.c:167:lnet_notify()) Ignoring prediction from 192.168.1.156@tcp of 192.168.1.94@tcp down 23987983666 seconds in the future
Jun 3 08:20:06 nasnu3 kernel: LustreError: 3660:0:(client.c:975:ptlrpc_expire_one_request()) @@@ timeout (sent at 1212495506, 100s ago) req@ffff8102109a3400 x601951/t0 o400->lusty-MDT0000_UUID@192.168.1.94@tcp:12 lens 128/128 ref 1 fl Rpc:N/0/0 rc 0/-22
Jun 3 08:20:06 nasnu3 kernel: LustreError: 3660:0:(client.c:975:ptlrpc_expire_one_request()) Skipped 1 previous similar message
Jun 3 08:20:06 nasnu3 kernel: Lustre: lusty-MDT0000-mdc-ffff810226207000: Connection to service lusty-MDT0000 via nid 192.168.1.94@tcp was lost; in progress operations using this service will wait for recovery to complete.
Jun 3 08:20:06 nasnu3 kernel: Lustre: Skipped 1 previous similar message
Jun 3 08:20:09 nasnu3 kernel: Lustre: 3624:0:(router.c:167:lnet_notify()) Ignoring prediction from 192.168.1.156@tcp of 192.168.1.94@tcp down 23987983641 seconds in the future
Jun 3 08:20:09 nasnu3 kernel: Lustre: 3624:0:(router.c:167:lnet_notify()) Skipped 1 previous similar message
Jun 3 08:21:49 nasnu3 kernel: Lustre: 3623:0:(router.c:167:lnet_notify()) Ignoring prediction from 192.168.1.156@tcp of 192.168.1.94@tcp down 23987983541 seconds in the future
Jun 3 08:22:53 nasnu3 kernel: LustreError: 7051:0:(mdc_locks.c:423:mdc_finish_enqueue()) ldlm_cli_enqueue: -4
Jun 3 08:22:53 nasnu3 kernel: LustreError: 7051:0:(file.c:2512:ll_inode_revalidate_fini()) failure -4 inode 10407505
Jun 3 08:23:26 nasnu3 kernel: LustreError: 3661:0:(client.c:975:ptlrpc_expire_one_request()) @@@ timeout (sent at 1212495706, 100s ago) req@ffff81021eee5a00 x601971/t0 o38->lusty-MDT0000_UUID@192.168.1.94@tcp:12 lens 304/328 ref 1 fl Rpc:/0/0 rc 0/-22
Jun 3 08:23:26 nasnu3 kernel: LustreError: 3661:0:(client.c:975:ptlrpc_expire_one_request()) Skipped 10 previous similar messages
Jun 3 08:23:26 nasnu3 kernel: LustreError: 167-0: This client was evicted by lusty-MDT0000; in progress operations using this service will fail.
Jun 3 08:23:26 nasnu3 kernel: LustreError: Skipped 1 previous similar message
Jun 3 08:23:26 nasnu3 kernel: LustreError: 7052:0:(mdc_locks.c:423:mdc_finish_enqueue()) ldlm_cli_enqueue: -5
Jun 3 08:23:26 nasnu3 kernel: LustreError: 7052:0:(mdc_locks.c:423:mdc_finish_enqueue()) Skipped 1 previous similar message
Jun 3 08:23:26 nasnu3 kernel: Lustre: lusty-MDT0000-mdc-ffff810226207000: Connection restored to service lusty-MDT0000 using nid 192.168.1.94@tcp.
Jun 3 08:23:26 nasnu3 kernel: Lustre: Skipped 1 previous similar message
Jun 3 08:23:26 nasnu3 kernel: LustreError: 7052:0:(file.c:2512:ll_inode_revalidate_fini()) failure -5 inode 10407505

Thanks,
murray
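[Editor's note: the interesting detail in these logs is that the MDT connection is using nid 192.168.1.94@tcp, i.e. the ethernet NID rather than the o2ib one. One way to check which NIDs the targets were registered with is to print their parameters on the server (hedged sketch; /dev/sda1 is a placeholder for the actual MDT/OST device):

# on the server, for each Lustre target device
tunefs.lustre --print /dev/sda1
# check the "Parameters:" line of the output for the mgsnode / failover NIDs
]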
Murray Smigel
2008-Jun-03 13:03 UTC
[Lustre-discuss] performance issues in simple lustre setup
More data. When I disconnect the IB connection on the server (192.168.3.50@o2ib) I see, on the client:

Jun 3 08:55:30 nasnu3 kernel: LustreError: 3634:0:(events.c:55:request_out_callback()) @@@ type 4, status -5 req@ffff8102109a3200 x602471/t0 o400->MGS@MGC192.168.3.50@o2ib_0:26 lens 128/128 ref 2 fl Rpc:N/0/0 rc 0/-22
Jun 3 08:55:30 nasnu3 kernel: LustreError: 3634:0:(events.c:55:request_out_callback()) Skipped 2 previous similar messages
Jun 3 08:55:30 nasnu3 kernel: LustreError: 3660:0:(client.c:975:ptlrpc_expire_one_request()) @@@ network error (sent at 1212497706, 24s ago) req@ffff8102109a3200 x602471/t0 o400->MGS@MGC192.168.3.50@o2ib_0:26 lens 128/128 ref 1 fl Rpc:N/0/0 rc 0/-22
Jun 3 08:55:30 nasnu3 kernel: LustreError: 3660:0:(client.c:975:ptlrpc_expire_one_request()) Skipped 1 previous similar message
Jun 3 08:55:30 nasnu3 kernel: LustreError: 166-1: MGC192.168.3.50@o2ib: Connection to service MGS via nid 192.168.3.50@o2ib was lost; in progress operations using this service will fail.
Jun 3 08:56:21 nasnu3 kernel: LustreError: 3523:0:(events.c:55:request_out_callback()) @@@ type 4, status -113 req@ffff8101fd292800 x602474/t0 o250->MGS@MGC192.168.3.50@o2ib_0:26 lens 304/328 ref 2 fl Rpc:/0/0 rc 0/-22
Jun 3 08:56:21 nasnu3 kernel: LustreError: 3661:0:(client.c:975:ptlrpc_expire_one_request()) @@@ network error (sent at 1212497731, 50s ago) req@ffff8101fd292800 x602474/t0 o250->MGS@MGC192.168.3.50@o2ib_0:26 lens 304/328 ref 1 fl Rpc:/0/0 rc 0/-22
Jun 3 08:57:15 nasnu3 kernel: LustreError: 7061:0:(client.c:519:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff81022d83aa00 x602487/t0 o101->MGS@MGC192.168.3.50@o2ib_0:26 lens 232/240 ref 1 fl Rpc:/0/0 rc 0/0
Jun 3 08:57:15 nasnu3 kernel: LustreError: 7061:0:(client.c:519:ptlrpc_import_delay_req()) Skipped 1 previous similar message

With the IB connection gone, the disk test to the Lustre directory still works and gives its usual ~13 MB/s rate.

Jun 3 08:57:36 nasnu3 kernel: LustreError: 3523:0:(events.c:55:request_out_callback()) @@@ type 4, status -113 req@ffff810191edfe00 x602481/t0 o250->MGS@MGC192.168.3.50@o2ib_0:26 lens 304/328 ref 1 fl Complete:E/0/0 rc -5/-22
Jun 3 08:57:36 nasnu3 kernel: LustreError: 3661:0:(client.c:975:ptlrpc_expire_one_request()) @@@ network error (sent at 1212497830, 26s ago) req@ffff810235f37e00 x602486/t0 o250->MGS@MGC192.168.3.50@o2ib_0:26 lens 304/328 ref 1 fl Rpc:/0/0 rc 0/-22
Jun 3 08:58:20 nasnu3 kernel: LustreError: 7061:0:(client.c:519:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff810150cf4000 x602749/t0 o101->MGS@MGC192.168.3.50@o2ib_0:26 lens 232/240 ref 1 fl Rpc:/0/0 rc 0/0
Jun 3 08:58:20 nasnu3 kernel: LustreError: 7061:0:(client.c:519:ptlrpc_import_delay_req()) Skipped 1 previous similar message
Jun 3 08:58:50 nasnu3 kernel: LustreError: 3523:0:(events.c:55:request_out_callback()) @@@ type 4, status -113 req@ffff8101c07c6600 x602748/t0 o250->MGS@MGC192.168.3.50@o2ib_0:26 lens 304/328 ref 2 fl Rpc:/0/0 rc 0/-22
Jun 3 08:58:50 nasnu3 kernel: LustreError: 3523:0:(events.c:55:request_out_callback()) Skipped 1 previous similar message

Reconnecting IB:

Jun 3 08:59:15 nasnu3 kernel: Lustre: MGC192.168.3.50@o2ib: Reactivating import
Jun 3 08:59:15 nasnu3 kernel: Lustre: MGC192.168.3.50@o2ib: Connection restored to service MGS using nid 192.168.3.50@o2ib.
Jun 3 08:59:15 nasnu3 kernel: Lustre: Skipped 1 previous similar message

Thanks,
murray
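[Editor's note: a quick way to confirm, independently of the mount, which of the server's NIDs the client can actually reach over LNET (run on the client; both NIDs taken from the lctl list_nids output earlier in the thread):

lctl ping 192.168.3.50@o2ib    # should answer over InfiniBand
lctl ping 192.168.1.94@tcp     # should answer over ethernet (while the cable is connected)
]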
Isaac Huang
2008-Jun-03 14:07 UTC
[Lustre-discuss] performance issues in simple lustre setup
Since both the configuration and the IB link bandwidth looked fine, I'd suggest measuring LNET throughput with lnet selftest:

1. On both client and server: modprobe lnet_selftest
2. On the client:
   export LST_SESSION=$$
   lst new_session --timeo 100000 test
   lst add_group s 192.168.3.50@o2ib
   lst add_group c 192.168.3.30@o2ib
   lst add_batch bw
   lst add_test --batch bw --loop 500 --concurrency 8 --distribute 1:1 --from c --to s brw write size=1M
   lst run bw
   lst stat c

Isaac

On Tue, Jun 03, 2008 at 07:39:44AM -0400, Murray Smigel wrote:
> ......
> On the server (I have tried both msi_x=1 and 0 with no difference), /etc/modprobe.conf contains:
> options lnet networks="tcp,o2ib"
> options mlx4_core msi_x=0
> alias ib0 ib_ipoib
> alias ib1 ib_ipoib
> [root@lusty lustre-iokit-1.2]# lctl list_nids
> 192.168.1.94@tcp
> 192.168.3.50@o2ib
> On the client, /etc/modprobe.d/lustre contains:
> options lnet networks="tcp,o2ib"
> options mlx4_core msi_x=1
> alias ib0 ib_ipoib
> alias ib1 ib_ipoib
> nasnu3:/home/murray/lustre-iokit-1.2# lctl list_nids
> 192.168.1.156@tcp
> 192.168.3.30@o2ib
> Thanks,
> murray
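[Editor's note: for completeness, a hedged sketch of wrapping up an lnet_selftest run once the numbers have been collected (same session and batch names as above):

lst stop bw          # stop the running batch
lst end_session      # tear down the selftest session
rmmod lnet_selftest  # on both nodes, once testing is finished
]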
Isaac Huang
2008-Jun-03 14:15 UTC
[Lustre-discuss] performance issues in simple lustre setup
On Tue, Jun 03, 2008 at 08:32:09AM -0400, Murray Smigel wrote:
> Some additional information on the problem. I tried disconnecting the
> ethernet connection to the server machine (192.168.1.94) and then running
> a disk test on the client (192.168.1.156 via ethernet), writing to what I
> thought was the IB-mounted file system (mount -t lustre
> 192.168.3.50@o2ib:/lusty /slut, where 192.168.3.50 is the IP of ib0 on the
> server machine).
> Looking at /var/log/syslog I saw:
> Jun 3 08:18:59 nasnu3 kernel: Lustre: 3625:0:(router.c:167:lnet_notify())
> Ignoring prediction from 192.168.1.156@tcp of 192.168.1.94@tcp down
> 23987983711 seconds in the future

Known bug, and already fixed:
https://bugzilla.lustre.org/show_bug.cgi?id=14300

> ......
> lusty-MDT0000-mdc-ffff810226207000: Connection restored to service
> lusty-MDT0000 using nid 192.168.1.94@tcp.

It seemed to me that Lustre was actually using the @tcp network, which could be a Lustre configuration issue.

Isaac
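[Editor's note: one quick way to confirm this from the client side is to grep the syslog for the connection messages already shown above, which name the NID each Lustre service connection is using:

grep -E 'Connection (to|restored to) service' /var/log/syslog
# lines ending in "...using nid 192.168.1.94@tcp" mean the MDT/OST traffic is on ethernet, not o2ib
]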
nasnu3:/slut# lst run bw; lst stat c
bw is running now
[LNet Rates of c]
[W] Avg: 800  RPC/s Min: 800  RPC/s Max: 800  RPC/s
[R] Avg: 1596 RPC/s Min: 1596 RPC/s Max: 1596 RPC/s
[LNet Bandwidth of c]
[W] Avg: 0.13   MB/s Min: 0.13   MB/s Max: 0.13   MB/s
[R] Avg: 798.09 MB/s Min: 798.09 MB/s Max: 798.09 MB/s
[LNet Rates of c]
[W] Avg: 0 RPC/s Min: 0 RPC/s Max: 0 RPC/s
[R] Avg: 0 RPC/s Min: 0 RPC/s Max: 0 RPC/s

Thanks,
murray

Isaac Huang wrote:
> Since both the configuration and the IB link bandwidth looked fine, I'd
> suggest measuring LNET throughput with lnet selftest:
>
> 1. On both client and server: modprobe lnet_selftest
> 2. On the client:
>    export LST_SESSION=$$
>    lst new_session --timeo 100000 test
>    lst add_group s 192.168.3.50@o2ib
>    lst add_group c 192.168.3.30@o2ib
>    lst add_batch bw
>    lst add_test --batch bw --loop 500 --concurrency 8 --distribute 1:1 --from c --to s brw write size=1M
>    lst run bw
>    lst stat c
>
> Isaac
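[Editor's note: the ~800 MB/s reported here suggests the o2ib LNET path itself is healthy, which is consistent with the suspicion that the slow filesystem traffic is going over @tcp rather than o2ib. If it is useful to exercise the other direction as well, a read test can be added to the same selftest session (hedged sketch, same groups as before):

lst add_batch bw_read
lst add_test --batch bw_read --loop 500 --concurrency 8 --distribute 1:1 --from c --to s brw read size=1M
lst run bw_read
lst stat c
]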
OK, I have a workaround. I removed the tcp option from the lnet configuration so that it now reads:

options lnet networks="o2ib"
options mlx4_core msi_x=1
alias ib0 ib_ipoib
alias ib1 ib_ipoib

Now I am seeing write bandwidth > 70 MB/s.

Thanks,
murray
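[Editor's note: if the same o2ib-only change is also made on the server side, so that the server no longer brings up a @tcp NID, the filesystem's configuration logs may still reference the old NIDs; in Lustre 1.6 the documented way to regenerate them is a writeconf. A hedged sketch, with placeholder device names and assuming the filesystem is unmounted everywhere first:

# on the server, with all targets and clients unmounted
tunefs.lustre --writeconf /dev/sda1    # MDT first
tunefs.lustre --writeconf /dev/sda2    # then each OST
# remount the MDT, then the OSTs, and finally the clients
]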