Hi all,

I have implemented handling of hosts with different PAGE_SIZE in MXLND. I am running tests to make sure that I did not accidentally break something else. So far, I have been using lctl to ping back and forth, as well as obdecho (driven by loadgen).

When running loadgen tests, or lctl test_brw with loadgen's echosrv running, if I kill a host (either client or server) and bring it back up, LNET seems happy (MXLND reconnects normally), but loadgen does not resume. When using lctl test_brw and I restart a server, the client reconnects but then fails an assertion when it connects to the new server:

  LustreError: 6295:0:(echo_client.c:1341:echo_client_cleanup()) ASSERTION(eco->eco_refcount == 0) failed

Is this the proper way to test an LND? What other methods can you suggest?

Thanks,
Scott
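The reconnect check described above (kill a host, bring it back, confirm LNET is happy) can be scripted roughly as follows. This is only a sketch: `wait_for_peer`, `PING_CMD`, and `RETRY_DELAY` are hypothetical names, the NID in the usage note is a placeholder, and it assumes `lctl ping` exits non-zero when the peer does not answer.

```shell
#!/bin/sh
# Sketch: poll a peer NID until LNET can reach it again after a restart.
# PING_CMD is a hypothetical override so the loop can be exercised without
# LNET loaded; by default it assumes "lctl ping <nid>" fails with a non-zero
# exit status while the peer is down.
PING_CMD=${PING_CMD:-"lctl ping"}

wait_for_peer() {
    nid=$1              # peer NID to probe
    tries=${2:-30}      # maximum number of attempts
    i=0
    while [ "$i" -lt "$tries" ]; do
        if $PING_CMD "$nid" >/dev/null 2>&1; then
            echo "peer $nid reachable after $i retries"
            return 0
        fi
        i=$((i + 1))
        sleep "${RETRY_DELAY:-1}"
    done
    echo "peer $nid still unreachable after $tries tries" >&2
    return 1
}
```

For example, `wait_for_peer 10.0.0.1@mx 60` after restarting the server would confirm the LND-level reconnect before loadgen or test_brw is started again.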
Scott Atchley wrote:
> Is this the proper way to test an LND? What other methods can you
> suggest?

I've been using LNet SelfTest (LST) for my LND development and testing. It is OK once you get things working, but it is painful to script around, and it is painful to trace LST errors back to LND transactions.

Nic
Nicholas Henke wrote:
> I've been using LNet SelfTest for my LND development and testing. It
> is ok once you get things working - but it is painful to script around
> and to trace LST errors back to LND transactions.

FWIW: Here are the scripts I've been using for LST runs. 'sim_tests.sh' runs tests in serial, parallel is... The 'sim_config' file is an easy way to manage different machine configs: just set the CONFIG environment variable to the particular file you'd like to use.

The CLI format is "test size loops", where:
 - test can be 'read', 'write', or 'ping'
 - size is anything <= 1m, e.g. '243k', '2342', '1m'
 - loops is the number to send

There is an 'NCONN' environment variable that controls the --concurrency for LST tests, basically how many RPCs to keep on the wire.

YMMV,
Nic

Attachments:
 - sim_tests.sh (application/x-sh, 1143 bytes)
   http://lists.lustre.org/pipermail/lustre-devel/attachments/20090716/133019f2/attachment.sh
 - sim_config
   http://lists.lustre.org/pipermail/lustre-devel/attachments/20090716/133019f2/attachment.pl
 - parallel_test.sh (application/x-sh, 1207 bytes)
   http://lists.lustre.org/pipermail/lustre-devel/attachments/20090716/133019f2/attachment-0001.sh
 - lnet_selftest_framework.sh (application/x-sh, 6366 bytes)
   http://lists.lustre.org/pipermail/lustre-devel/attachments/20090716/133019f2/attachment-0002.sh
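For readers without the attachments, the "test size loops" format and the NCONN variable could map onto an `lst add_test` line roughly like this. It is a sketch, not the contents of the attached scripts: `size_to_bytes`, `lst_add_test_cmd`, and the BATCH, FROM, and TO variables with their default names (sim, clients, servers) are hypothetical, and the function only prints the command rather than running it.

```shell
#!/bin/sh
# Sketch only: these helpers are hypothetical, not the attached scripts.

# Convert '2342', '243k', or '1m' to bytes; reject anything above 1m.
size_to_bytes() {
    case $1 in
        *[kK]) n=${1%[kK]}; mult=1024 ;;
        *[mM]) n=${1%[mM]}; mult=1048576 ;;
        *)     n=$1;        mult=1 ;;
    esac
    case $n in
        ''|*[!0-9]*) echo "bad size: $1" >&2; return 1 ;;
    esac
    bytes=$((n * mult))
    if [ "$bytes" -gt 1048576 ]; then
        echo "size $1 exceeds 1m" >&2
        return 1
    fi
    echo "$bytes"
}

# Print (do not run) the lst add_test line for "test size loops".
# NCONN feeds --concurrency; BATCH, FROM, and TO are assumed names for a
# batch and two groups created earlier in the LST session.
lst_add_test_cmd() {
    kind=$1
    size=$2
    loops=$3
    opts="--batch ${BATCH:-sim} --loop $loops --concurrency ${NCONN:-1}"
    opts="$opts --from ${FROM:-clients} --to ${TO:-servers}"
    case $kind in
        ping)
            echo "lst add_test $opts ping" ;;
        read|write)
            bytes=$(size_to_bytes "$size") || return 1
            echo "lst add_test $opts brw $kind size=$bytes" ;;
        *)
            echo "unknown test: $kind" >&2
            return 1 ;;
    esac
}
```

Running `NCONN=8 lst_add_test_cmd write 1m 100` prints the command to review before pasting it into a session that has already done `lst new_session`, `lst add_group`, and `lst add_batch`.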