Hello!

I have the following problem with Lustre. After installing Lustre 1.6.1 on SLES 10 (AMD64) from RPMs, everything was fine. I then tried to test Lustre by repeatedly copying, untarring, and then deleting the contents of a big archive (about 2 GB) in a loop. After the fifth cycle I received the following messages in my /var/log/messages file:

Sep 3 19:44:58 snapxtc3 kernel: Lustre: 3631:0:(ldlm_lib.c:502:target_handle_reconnect()) Skipped 24 previous similar messages
Sep 3 19:44:58 snapxtc3 kernel: Lustre: 3631:0:(ldlm_lib.c:724:target_handle_connect()) snapxtfs-OST0000: refuse reconnection from d9f5edc9-a349-4483-5c2f-7fdeaae46d60 at 0@lo to 0xffff81012963d000/3
Sep 3 19:44:58 snapxtc3 kernel: Lustre: 3631:0:(ldlm_lib.c:724:target_handle_connect()) Skipped 24 previous similar messages
Sep 3 19:44:58 snapxtc3 kernel: LustreError: 3631:0:(ldlm_lib.c:1395:target_send_reply_msg()) @@@ processing error (-16) req at ffff810131bd8000 x6272801/t0 o8->d9f5edc9-a349-4483-5c2f-7fdeaae46d60 at 212.110.147.59@tcp:-1 lens 304/200 ref 0 fl Interpret:/0/0 rc -16/0
Sep 3 19:44:58 snapxtc3 kernel: LustreError: 3631:0:(ldlm_lib.c:1395:target_send_reply_msg()) Skipped 24 previous similar messages
Sep 3 19:44:58 snapxtc3 kernel: LustreError: 11-0: an error ocurred while communicating with (no nid) The ost_connect operation failed with -16
Sep 3 19:44:58 snapxtc3 kernel: LustreError: Skipped 24 previous similar messages
Sep 3 19:52:28 snapxtc3 sshd[5441]: Accepted keyboard-interactive/pam for vorl from 212.110.147.40 port 52949 ssh2
Sep 3 19:52:32 snapxtc3 su: (to root) vorl on /dev/pts/6
Sep 3 19:55:23 snapxtc3 kernel: Lustre: 3687:0:(ldlm_lib.c:502:target_handle_reconnect()) snapxtfs-OST0000: d9f5edc9-a349-4483-5c2f-7fdeaae46d60 reconnecting
Sep 3 19:55:23 snapxtc3 kernel: Lustre: 3687:0:(ldlm_lib.c:502:target_handle_reconnect()) Skipped 24 previous similar messages
Sep 3 19:55:23 snapxtc3 kernel: Lustre: 3687:0:(ldlm_lib.c:724:target_handle_connect()) snapxtfs-OST0000: refuse reconnection from d9f5edc9-a349-4483-5c2f-7fdeaae46d60 at 0@lo to 0xffff81012963d000/3
Sep 3 19:55:23 snapxtc3 kernel: Lustre: 3687:0:(ldlm_lib.c:724:target_handle_connect()) Skipped 24 previous similar messages
Sep 3 19:55:23 snapxtc3 kernel: LustreError: 3687:0:(ldlm_lib.c:1395:target_send_reply_msg()) @@@ processing error (-16) req at ffff81000ceff800 x6272901/t0 o8->d9f5edc9-a349-4483-5c2f-7fdeaae46d60 at 212.110.147.59@tcp:-1 lens 304/200 ref 0 fl Interpret:/0/0 rc -16/0
Sep 3 19:55:23 snapxtc3 kernel: LustreError: 3687:0:(ldlm_lib.c:1395:target_send_reply_msg()) Skipped 24 previous similar messages
Sep 3 19:55:23 snapxtc3 kernel: LustreError: 11-0: an error ocurred while communicating with (no nid) The ost_connect operation failed with -16
Sep 3 19:55:23 snapxtc3 kernel: LustreError: Skipped 24 previous similar messages

and so on...

Also, if I try to use the directory where the Lustre file system is mounted in any way, the command just hangs. For example:

ls -l /snapxt (here lustrefs is mounted)
... (my shell is hanging)

I also have a lot of ll_ost_io processes running on my computer:

4172 ? S 0:00 [ll_ost_io_309]
4173 ? S 0:00 [ll_ost_io_310]
4174 ? S 0:00 [ll_ost_io_311]
4175 ? S 0:00 [ll_ost_io_312]
4176 ? S 0:00 [ll_ost_io_313]
4177 ? S 0:00 [ll_ost_io_314]
4178 ? S 0:00 [ll_ost_io_315]
4179 ? S 0:00 [ll_ost_io_316]
4180 ? S 0:00 [ll_ost_io_317]
4181 ? S 0:00 [ll_ost_io_318]
4182 ? S 0:00 [ll_ost_io_319]
...

Is this a Lustre bug or just a consequence of my wrong configuration?
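
For anyone trying to reproduce this, the copy / untar / delete loop described above could look roughly like the sketch below. This is only an assumed reconstruction: the function name stress_cycle, the per-cycle working directory name, and the arguments are placeholders and are not taken from the original test.

```shell
#!/bin/sh
# Hypothetical sketch of the copy / untar / delete test loop.
# stress_cycle and its arguments are illustrative names, not from the report.
stress_cycle() {
    archive=$1   # path to the tar archive to copy and unpack
    target=$2    # directory on the file system under test
    cycles=$3    # number of iterations to run

    i=1
    while [ "$i" -le "$cycles" ]; do
        work="$target/stress.$i"
        mkdir -p "$work" || return 1
        # Copy the archive onto the file system under test.
        cp "$archive" "$work/" || return 1
        # Unpack the copied archive in place.
        tar -xf "$work/$(basename "$archive")" -C "$work" || return 1
        # Delete everything created in this cycle.
        rm -rf "$work" || return 1
        echo "cycle $i done"
        i=$((i + 1))
    done
}
```

It would be called as, for example, `stress_cycle /path/to/archive.tar /snapxt 10`, where the archive path and the cycle count are placeholders. Stopping at the first failing step makes it easier to see which operation (copy, untar, or delete) was running when the errors started.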