Hi all,

I have a Lustre 1.8.3 setup where I'd like to run some LNet router performance tests with routing between DDR IB<->10GE networks. Currently I have three nodes: one with DDR IB, one with 10GE, and one with both that does the routing. A first short LNet test shows 520-550 MB/s.

Does anyone have an idea which of the lnet module parameters are worth playing with to get this number a bit closer to 1 GB/s?

parm: tiny_router_buffers:# of 0 payload messages to buffer in the router (int)
parm: small_router_buffers:# of small (1 page) messages to buffer in the router (int)
parm: large_router_buffers:# of large messages to buffer in the router (int)
parm: peer_buffer_credits:# router buffer credits per peer (int)

The CPU on the router node is less utilized than it was when I did back-to-back 10GE tests. The machine has 6 cores; 5 were idle and one showed a load of about 60%.

Michael

--
Michael Kluge, M.Sc.
Technische Universität Dresden
Center for Information Services and High Performance Computing (ZIH)
D-01062 Dresden
Germany

Contact:
Willersbau, Room A 208
Phone: (+49) 351 463-34217
Fax: (+49) 351 463-37773
e-mail: michael.kluge at tu-dresden.de
WWW: http://www.tu-dresden.de/zih
On 2010-09-10, at 08:23, Michael Kluge wrote:
> I have a Lustre 1.8.3 setup where I'd like to run some LNet router performance tests with routing between DDR IB<->10GE networks. Currently I have three nodes, one with DDR IB, one with 10GE and one with both that does the routing. A first short LNet test shows 520-550 MB/s performance.
>
> Has anyone an idea which of the variables of the lnet module are worth playing with to get this number a bit closer to 1GB/s?

I would start by testing the performance on just the 10GigE side, and then separately on the IB side, to verify you are getting the expected performance from the components before trying them both together. Often it is necessary to tune the Ethernet send/receive buffers.

Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.
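As a rough illustration of the Ethernet send/receive buffer tuning mentioned above, a sysctl fragment for a 10GE host might look like the following. This is only a sketch: the file name and all values are assumptions chosen for illustration, not settings taken from this thread, and suitable sizes depend on the NIC, the driver and the kernel.

```
# /etc/sysctl.d/10ge-buffers.conf  (hypothetical file name, illustrative values)
# Upper limits for socket receive/send buffers, in bytes.
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
# TCP buffer sizes: minimum, default, maximum (bytes).
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
```

Whether any of this helps is best checked with a back-to-back test on the Ethernet link alone, before and after the change.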
Hi Andreas,

On 10.09.2010, at 16:35, Andreas Dilger wrote:
> I would start by testing the performance on just the 10GigE side, and then separately on the IB side, to verify you are getting the expected performance from the components before trying them both together. Often it is necessary to tune the Ethernet send/receive buffers.

Ethernet back to back is at 950 MB/s. I have not looked at IB back to back yet.

Michael
OK, IB back to back is at 1.2 GB/s, 10GE back to back is at 950 MB/s, and with the additional LNet router in between I see 550 MB/s. Time for LNet tuning?

Michael
And here are my params:

root@doss05:/home/tests/lnet# for F in /sys/module/lnet/parameters/* ; do echo -n "$F: "; cat $F ; done
/sys/module/lnet/parameters/accept: secure
/sys/module/lnet/parameters/accept_backlog: 127
/sys/module/lnet/parameters/accept_port: 988
/sys/module/lnet/parameters/accept_timeout: 5
/sys/module/lnet/parameters/auto_down: 1
/sys/module/lnet/parameters/avoid_asym_router_failure: 0
/sys/module/lnet/parameters/check_routers_before_use: 0
/sys/module/lnet/parameters/config_on_load: 0
/sys/module/lnet/parameters/dead_router_check_interval: 0
/sys/module/lnet/parameters/forwarding: enabled
/sys/module/lnet/parameters/ip2nets:
/sys/module/lnet/parameters/large_router_buffers: 512
/sys/module/lnet/parameters/live_router_check_interval: 0
/sys/module/lnet/parameters/local_nid_dist_zero: 1
/sys/module/lnet/parameters/networks: tcp0(eth2),o2ib(ib1)
/sys/module/lnet/parameters/peer_buffer_credits: 0
/sys/module/lnet/parameters/portals_compatibility: none
/sys/module/lnet/parameters/router_ping_timeout: 50
/sys/module/lnet/parameters/routes:
/sys/module/lnet/parameters/small_router_buffers: 8192
/sys/module/lnet/parameters/tiny_router_buffers: 1024

I have not used ip2nets to configure routing, but put explicit routing statements into the modprobe.d/ files. Is that OK?

Michael
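For readers unfamiliar with explicit routing statements in modprobe.d/, a setup like the one described might be configured roughly as follows. This is a sketch, not the poster's actual files: the router's `networks` and `forwarding` values are taken from the parameter dump above, while the file name, the client interface names (`ib0`, `eth0`) and the router NIDs (`10.148.0.1@o2ib`, `192.168.10.1@tcp0`) are hypothetical placeholders.

```
# /etc/modprobe.d/lustre.conf on the router (hypothetical file name)
# networks/forwarding as shown in the parameter dump; buffer counts are
# the values reported there and are the knobs the original question asks about.
options lnet networks="tcp0(eth2),o2ib(ib1)" forwarding="enabled"
options lnet tiny_router_buffers=1024 small_router_buffers=8192 large_router_buffers=512

# On the IB-only node: reach tcp0 through the router's IB NID (placeholder).
# options lnet networks="o2ib(ib0)" routes="tcp0 10.148.0.1@o2ib"

# On the 10GE-only node: reach o2ib through the router's TCP NID (placeholder).
# options lnet networks="tcp0(eth0)" routes="o2ib 192.168.10.1@tcp0"
```

With this layout the router itself needs no `routes` entry, which matches the empty `routes:` line in the dump.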
Does anyone else have a 10GE<->IB Lustre router? What are the typical performance numbers? How close do you get to 1 GB/s?

Michael
Michael,

How are you generating load and measuring the throughput? I'm particularly interested in the number of nodes on each side of the router and how many messages you have in flight between each one.

Cheers,
Eric
Hi Eric,

basically, right now I have one IB node, one 10GE node and one router node that has both types of network interfaces.

I've got a small LNet test script on the router node that does the work:

export LST_SESSION=$$
lst new_session rw
lst add_group readers 192.168.10.8@tcp
lst add_group writers 10.148.0.94@o2ib
lst add_batch bulk_rw
lst add_test --batch bulk_rw --from writers --to readers brw read check=simple size=1M
lst run bulk_rw
lst stat writers & sleep 30; kill $!
lst end_session

Is there a way to figure out the number of messages in flight? I remember there being an "RPCs in flight" tunable, but that belongs to the OSC layer, which is not involved in my case (I think).

Michael
On 09/13/2010 08:35 AM, Michael Kluge wrote:
> Is there a way to figure out the messages in flight? I remember to have
> a "rpc's in flight" tunable but this is connected to the OSC layer which
> does not do anything in my case (I think).

If you don't specify --concurrency to 'lst add_test', you get 1 RPC in flight.

Nic
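The point about --concurrency can be sketched as a small wrapper around the test script from this thread. The `lst` commands and NIDs are the ones posted above; the function name `run_brw_test`, the `LST` variable and the dry-run trick are assumptions added for illustration.

```shell
#!/bin/sh
# Sketch: the selftest run from this thread with the number of RPCs in
# flight made explicit. LST defaults to the real "lst" binary; setting
# LST=echo gives a dry run that only prints the commands.
LST=${LST:-lst}

# usage: run_brw_test CONCURRENCY [SECONDS]
run_brw_test() {
    concurrency=$1
    duration=${2:-30}

    export LST_SESSION=$$
    $LST new_session rw
    $LST add_group readers 192.168.10.8@tcp
    $LST add_group writers 10.148.0.94@o2ib
    $LST add_batch bulk_rw
    # --concurrency sets how many RPCs are kept in flight; without it the
    # test runs with a single RPC in flight.
    $LST add_test --batch bulk_rw --concurrency "$concurrency" \
        --from writers --to readers brw read check=simple size=1M
    $LST run bulk_rw
    # Print statistics in the background, then stop the sampler.
    $LST stat writers &
    stat_pid=$!
    sleep "$duration"
    kill "$stat_pid" 2>/dev/null || true
    $LST end_session
}
```

Calling `run_brw_test 8` would run the same bulk read test with eight RPCs in flight; repeating it for several values gives the data points for a throughput-versus-concurrency curve.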
On 09/10/2010 10:48 AM, Michael Kluge wrote:
> OK, IB back to back is at 1.2 GB/s, 10GE back to back at 950 MB/s, with
> additional lnet router I see 550 MB/s. Time for lnet tuning?

This smells like latency: you doubled the number of hops and lost ~50% of your performance. Is your later email correct for this test, i.e. a single client->router->server?

I'd try adding higher concurrency, or more clients with the same concurrency, to see if you can ameliorate some of that. I'd also benchmark each link separately with higher concurrency to see what the limits are. You should be able to graph concurrency/client count on the X axis and see a nice smooth curve that flattens out to some maximum performance.

Cheers,
Nic

P.S. To get more accurate performance numbers, you probably want to run 'lst stat writers 5 & sleep 30'. You might want to take the middle 3 data points and average them, discarding the first and last, which could be subject to LST overhead and ramp up/down. FWIW, this is what I do when I'm benchmarking throughput rates.

Nic
On 09/13/2010 08:56 AM, Nic Henke wrote:
> P.S. To get more accurate performance numbers, you probably want to run
> 'lst stat writers 5 & sleep 30'. You might want to take the middle 3
> data points and average them, discarding the first and last that could
> be subject to LST overhead and ramp up/down. FWIW, this is what I do
> when I'm benchmarking throughput rates.

This should read 'lst stat writers --delay 5 & sleep 30'.

There is also a '--count' parameter in more recent LNets that would allow you to do '--delay 5 --count 6' and avoid the sleep/kill.

Nic
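The averaging step described here (drop the first and last sample, average the rest) can be sketched as a small shell helper. The function name `avg_middle` is made up for this example, and it assumes the throughput samples have already been collected as one number per line; it does not parse `lst stat` output itself.

```shell
#!/bin/sh
# Average throughput samples while discarding the first and the last one,
# since those may be distorted by LST overhead and ramp up/down.
# Reads one number per line on stdin; prints the mean with one decimal.
avg_middle() {
    awk '
        { v[NR] = $1 }
        END {
            if (NR < 3) {
                print "need at least 3 samples" > "/dev/stderr"
                exit 1
            }
            for (i = 2; i < NR; i++) sum += v[i]
            printf "%.1f\n", sum / (NR - 2)
        }'
}
```

For example, `printf '100\n540\n550\n560\n90\n' | avg_middle` ignores the 100 and the 90 and prints 550.0.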
Michael,

I think you may have only got 1 BRW READ in flight at a time with this script, so I would expect the routed throughput to be getting on for half of the direct throughput. Can you try "--concurrency 8" to simulate the number of I/Os a real client would keep in flight?

Cheers,
Eric
Nic,

thanks a lot. That made my day.

Michael

On 13.09.2010 at 06:49, Nic Henke wrote:

> On 09/13/2010 08:35 AM, Michael Kluge wrote:
>> Hi Eric,
>>
>> basically right now I have one IB node, one 10GE node and one router
>> node that has both types of network interfaces.
>>
>> I've got a small lnet test script on the router node, that does the work:
>> export LST_SESSION=$$
>> lst new_session rw
>> lst add_group readers 192.168.10.8@tcp
>> lst add_group writers 10.148.0.94@o2ib
>> lst add_batch bulk_rw
>> lst add_test --batch bulk_rw --from writers --to readers brw read check=simple size=1M
>> lst run bulk_rw
>> lst stat writers & sleep 30; kill $!
>> lst end_session
>>
>> Is there a way to figure out the messages in flight? I remember there being
>> an "rpc's in flight" tunable but this is connected to the OSC layer which
>> does not do anything in my case (I think).
>
> If you don't specify --concurrency to the 'lst add_test', you get 1 RPC
> in flight.
>
> Nic
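Following Nic's point, here is a minimal sketch of the same script with the concurrency made explicit. The function name run_bulk_rw and the LST_CMD, CONCURRENCY and DURATION variables are hypothetical additions for illustration and are not part of lst; the NIDs and test options are the ones from the script above, and --concurrency is the 'lst add_test' option Nic refers to.

```shell
#!/bin/sh
# Sketch only: wraps the lst commands from the thread in a function so the
# number of RPCs in flight can be varied between runs.
run_bulk_rw() {
    lst_cmd=${LST_CMD:-lst}          # hypothetical override, e.g. "echo lst" for a dry run
    concurrency=${CONCURRENCY:-8}    # RPCs kept in flight; 1 if the option is omitted (per Nic)
    duration=${DURATION:-30}         # seconds to sample statistics

    export LST_SESSION=$$
    $lst_cmd new_session rw
    $lst_cmd add_group readers 192.168.10.8@tcp
    $lst_cmd add_group writers 10.148.0.94@o2ib
    $lst_cmd add_batch bulk_rw
    $lst_cmd add_test --batch bulk_rw --concurrency "$concurrency" \
        --from writers --to readers brw read check=simple size=1M
    $lst_cmd run bulk_rw
    $lst_cmd stat writers &
    sleep "$duration"
    kill $! 2>/dev/null || true      # stop the background stat sampler
    $lst_cmd end_session
}
```

Called as `CONCURRENCY=8 run_bulk_rw`, this would repeat the measurement with eight RPCs in flight; setting `LST_CMD="echo lst"` prints the commands instead of running them.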
Hi Eric,

--concurrency 2 already boosted the performance to 1026 MB/s. I don't think we'll get any more out of this :)

Thanks a lot,
Michael

On 13.09.2010 at 07:55, Eric Barton wrote:

> Michael,
>
> I think you may have only got 1 BRW READ in flight at a time with this script,
> so I would expect the routed throughput to be getting on for half of direct
> throughput. Can you try "--concurrency 8" to simulate the number of I/Os
> a real client would keep in flight?
>
> Cheers,
> Eric
>
> From: Michael Kluge [mailto:michael.kluge at tu-dresden.de]
> Sent: 13 September 2010 10:35 PM
> To: Eric Barton
> Cc: 'Lustre Diskussionsliste'
> Subject: Re: [Lustre-discuss] lnet router tuning
>
> Hi Eric,
>
> basically right now I have one IB node, one 10GE node and one router node that has both types of network interfaces.
>
> I've got a small lnet test script on the router node, that does the work:
> export LST_SESSION=$$
> lst new_session rw
> lst add_group readers 192.168.10.8@tcp
> lst add_group writers 10.148.0.94@o2ib
> lst add_batch bulk_rw
> lst add_test --batch bulk_rw --from writers --to readers brw read check=simple size=1M
> lst run bulk_rw
> lst stat writers & sleep 30; kill $!
> lst end_session
>
> Is there a way to figure out the messages in flight? I remember there being an "rpc's in flight" tunable but this is connected to the OSC layer which does not do anything in my case (I think).
>
> Michael
>
> On 13.09.2010 at 03:08, Eric Barton wrote:
>
> Michael,
>
> How are you generating load and measuring the throughput? I'm particularly interested in the number
> of nodes on each side of the router and how many messages you have in flight between each one.
>
> Cheers,
> Eric
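Eric's estimate can be checked with some back-of-envelope arithmetic. The sketch below assumes a simplified store-and-forward model, which is an assumption and not something stated in the thread: with a single 1 MB message in flight, the transfer crosses the IB hop and the 10GE hop one after the other, so the per-byte times add up. serial_rate is a hypothetical helper name; the inputs are the back-to-back figures Michael measured (1.2 GB/s taken as 1200 MB/s, and 950 MB/s).

```shell
#!/bin/sh
# Rough estimate only: combined rate when two hops are used strictly in series.
serial_rate() {
    # $1, $2: per-hop throughput in MB/s; prints the combined rate in MB/s
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", 1 / (1/a + 1/b) }'
}

serial_rate 1200 950   # IB hop, then 10GE hop, one message in flight
```

Under that assumption the result is about 530 MB/s, which falls in the 520-550 MB/s range Michael first saw. That would be consistent with the single message in flight being the limit, and not the router buffer parameters.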