Dear All,

Yesterday evening our cluster stopped. Two of our nodes tried to take the resource from each other; as far as I could tell, neither could see the other side. I stopped heartbeat and the resources, started them again, and the cluster came back online and worked fine.

This morning I saw this in the logs:

Feb 22 03:25:07 node4 kernel: Lustre: 7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall /usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.139@tcp,down,1203647043
Feb 22 03:25:16 node4 kernel: Lustre: 7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall /usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.15@tcp,down,1203647045
Feb 22 03:25:17 node4 kernel: Lustre: 7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall /usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.17@tcp,down,1203647044
Feb 22 03:25:24 node4 kernel: Lustre: 7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall /usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.179@tcp,down,1203647064
Feb 22 03:25:24 node4 kernel: Lustre: 7:0:(linux-debug.c:98:libcfs_run_upcall()) Skipped 2 previous similar messages
Feb 22 03:25:29 node4 kernel: Lustre: 7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall /usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.11@tcp,down,1203647123
Feb 22 03:25:33 node4 kernel: LustreError: 4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection request from 192.168.0.13
Feb 22 03:25:43 node4 kernel: LustreError: 4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection request from 192.168.0.17
Feb 22 03:25:59 node4 kernel: LustreError: 4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection request from 192.168.0.13
Feb 22 03:26:04 node4 kernel: LustreError: 4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection request from 192.168.0.179
Feb 22 03:26:04 node4 kernel: LustreError: 4564:0:(socklnd_cb.c:2160:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.0.139
Feb 22 03:26:09 node4 kernel: LustreError: 4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection request from 192.168.0.120
Feb 22 03:26:13 node4 kernel: LustreError: 4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection request from 192.168.0.11
Feb 22 03:26:14 node4 kernel: Lustre: 4816:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 2a02ce4a-c2cf-36f6-1cf1-82a5c4b22459 reconnecting
Feb 22 03:26:14 node4 kernel: Lustre: 4671:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 3e64ed95-8693-9c34-a32e-b803bda9017c reconnecting
Feb 22 03:26:29 node4 kernel: Lustre: 4750:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: c286201b-ac3e-07d6-a17b-985129e6b10d reconnecting
Feb 22 03:26:32 node4 kernel: Lustre: 4675:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 3b5cafac-fa5a-1040-3749-3b9530401684 reconnecting
Feb 22 03:26:35 node4 kernel: Lustre: 4665:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 9220135a-ffbc-0c99-6187-eb7c05c7e008 reconnecting
Feb 22 03:26:36 node4 kernel: Lustre: 4785:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: e3dad450-aa12-6959-a62b-48dd320936ff reconnecting
Feb 22 03:26:43 node4 kernel: Lustre: 4795:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 9762fef8-bb47-ca87-d2cd-7c439607c523 reconnecting
Feb 22 03:26:44 node4 kernel: Lustre: 4814:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 9c0b2b34-745e-23f2-dd10-3a60add8b9b5 reconnecting
Feb 22 03:26:48 node4 kernel: Lustre: 4821:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 6c2b9a02-028e-f8bb-3cd2-aa10433721ce reconnecting
Feb 22 03:26:50 node4 kernel: Lustre: 4781:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 87c70d18-3f76-7ba3-7c88-1c171e6acb08 reconnecting
Feb 22 03:26:54 node4 kernel: Lustre: 4738:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 5b1bc354-6528-1343-0f2b-6a449c0cfe3e reconnecting
Feb 22 03:26:58 node4 kernel: Lustre: 4819:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: bf28f4b2-f9aa-5d83-a1a3-84964a8b525c reconnecting
Feb 22 03:27:11 node4 kernel: Lustre: 4769:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 948dcff3-b2da-b501-c2fc-3b9fcf85115b reconnecting
Feb 22 03:27:16 node4 kernel: Lustre: 4659:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 8100747d-014b-deba-dd95-23973440bc17 reconnecting
Feb 22 03:27:38 node4 kernel: Lustre: 4655:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: f1ba7827-0ffe-69e3-3809-e602b55aab49 reconnecting
Feb 22 04:00:50 node4 kernel: Lustre: 6:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall /usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.12@tcp,down,1203649199
Feb 22 04:00:50 node4 kernel: Lustre: 6:0:(linux-debug.c:98:libcfs_run_upcall()) Skipped 1 previous similar message
Feb 22 04:00:51 node4 kernel: Lustre: 6:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall /usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.120@tcp,down,1203649172
Feb 22 04:01:01 node4 kernel: LustreError: 4562:0:(socklnd_cb.c:2160:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.0.179
Feb 22 04:01:04 node4 kernel: Lustre: 7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall /usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.22@tcp,down,1203649206
Feb 22 04:01:04 node4 kernel: Lustre: 7:0:(linux-debug.c:98:libcfs_run_upcall()) Skipped 1 previous similar message
Feb 22 04:01:20 node4 kernel: LustreError: 4563:0:(socklnd_cb.c:2160:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.0.11

On the other side:

Feb 22 03:25:46 node3 kernel: LustreError: 16228:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 'hallmark-OST0004_UUID' is not available for connect (no target) req@e8c5d800 x79341/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc 0/0
Feb 22 03:25:46 node3 kernel: LustreError: 16228:0:(ldlm_lib.c:576:target_handle_connect()) Skipped 14 previous similar messages
Feb 22 03:25:46 node3 kernel: LustreError: 16228:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error (-19) req@e8c5d800 x79341/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 03:25:46 node3 kernel: LustreError: 16228:0:(ldlm_lib.c:1363:target_send_reply_msg()) Skipped 30 previous similar messages
Feb 22 03:26:06 node3 kernel: LustreError: 16609:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 'hallmark-OST0004_UUID' is not available for connect (no target) req@f7fcec2c x300361/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc 0/0
Feb 22 03:26:06 node3 kernel: LustreError: 16609:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error (-19) req@f7fcec2c x300361/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 03:26:49 node3 kernel: LustreError: 16606:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 'hallmark-OST0004_UUID' is not available for connect (no target) req@c19f0400 x357320/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc 0/0
Feb 22 03:26:49 node3 kernel: LustreError: 16606:0:(ldlm_lib.c:576:target_handle_connect()) Skipped 9 previous similar messages
Feb 22 03:26:49 node3 kernel: LustreError: 16606:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error (-19) req@c19f0400 x357320/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 03:26:49 node3 kernel: LustreError: 16606:0:(ldlm_lib.c:1363:target_send_reply_msg()) Skipped 9 previous similar messages
Feb 22 04:01:30 node3 kernel: LustreError: 16228:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 'hallmark-OST0004_UUID' is not available for connect (no target) req@d607e200 x301042/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc 0/0
Feb 22 04:01:30 node3 kernel: LustreError: 16228:0:(ldlm_lib.c:576:target_handle_connect()) Skipped 2 previous similar messages
Feb 22 04:01:30 node3 kernel: LustreError: 16228:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error (-19) req@d607e200 x301042/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 04:01:30 node3 kernel: LustreError: 16228:0:(ldlm_lib.c:1363:target_send_reply_msg()) Skipped 2 previous similar messages
Feb 22 04:01:45 node3 kernel: LustreError: 16610:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 'hallmark-OST0004_UUID' is not available for connect (no target) req@c1b13a00 x127933/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc 0/0
Feb 22 04:01:45 node3 kernel: LustreError: 16610:0:(ldlm_lib.c:576:target_handle_connect()) Skipped 4 previous similar messages
Feb 22 04:01:45 node3 kernel: LustreError: 16610:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error (-19) req@c1b13a00 x127933/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 04:01:45 node3 kernel: LustreError: 16610:0:(ldlm_lib.c:1363:target_send_reply_msg()) Skipped 4 previous similar messages

And so on, a couple of times. After that:

Feb 22 11:16:20 node4 kernel: Lustre: hallmark-OST0004: haven't heard from client 11e65f33-019b-c3cc-17d9-2ccf559a86cd (at 192.168.0.173@tcp) in 227 seconds. I think it's dead, and I am evicting it.
Feb 22 11:19:12 node4 kernel: Lustre: 7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall /usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.183@tcp,down,1203675510
Feb 22 11:19:13 node4 kernel: Lustre: 7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall /usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.187@tcp,down,1203675501
Feb 22 11:19:21 node4 kernel: Lustre: 7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall /usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.150@tcp,down,1203675540
Feb 22 11:19:25 node4 kernel: Lustre: 7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall /usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.184@tcp,down,1203675509
Feb 22 11:19:31 node4 kernel: Lustre: 7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall /usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.130@tcp,down,1203675493
Feb 22 11:19:38 node4 kernel: LustreError: 4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection request from 192.168.0.139
Feb 22 11:19:41 node4 kernel: Lustre: 7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall /usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.106@tcp,down,1203675499
Feb 22 11:19:43 node4 kernel: LustreError: 4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection request from 192.168.0.16
Feb 22 11:19:48 node4 kernel: LustreError: 4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection request from 192.168.0.12
Feb 22 11:19:53 node4 kernel: LustreError: 4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection request from 192.168.0.68
Feb 22 11:19:58 node4 kernel: LustreError: 4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection request from 192.168.0.187
Feb 22 11:20:03 node4 kernel: LustreError: 4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection request from 192.168.0.183
Feb 22 11:20:10 node4 kernel: LustreError: 4563:0:(socklnd_cb.c:2160:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.0.166
Feb 22 11:20:12 node4 kernel: LustreError: 4562:0:(socklnd_cb.c:2160:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.0.17
Feb 22 11:20:12 node4 kernel: LustreError: 4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection request from 192.168.0.168
Feb 22 11:20:17 node4 kernel: LustreError: 4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection request from 192.168.0.139
Feb 22 11:20:22 node4 kernel: LustreError: 4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection request from 192.168.0.18
Feb 22 11:20:27 node4 kernel: LustreError: 4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection request from 192.168.0.68
Feb 22 11:20:32 node4 kernel: LustreError: 4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection request from 192.168.0.138
Feb 22 11:20:42 node4 kernel: LustreError: 4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection request from 192.168.0.112
Feb 22 11:20:42 node4 kernel: LustreError: 4567:0:(acceptor.c:442:lnet_acceptor()) Skipped 1 previous similar message
Feb 22 11:20:47 node4 kernel: Lustre: 4810:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 88e387a4-d83e-de76-51e7-6db0118d556e reconnecting
Feb 22 11:20:53 node4 kernel: Lustre: 4749:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 152f0c05-d8cd-99d2-7d79-248cf7c45cf2 reconnecting
Feb 22 11:20:55 node4 kernel: Lustre: 4789:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: f1ba7827-0ffe-69e3-3809-e602b55aab49 reconnecting
Feb 22 11:20:55 node4 kernel: Lustre: 4789:0:(ldlm_lib.c:497:target_handle_reconnect()) Skipped 11 previous similar messages
Feb 22 11:20:56 node4 kernel: Lustre: 4680:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 72c636f0-a9e8-646c-2052-94898f85d173 reconnecting
Feb 22 11:20:57 node4 kernel: Lustre: 4758:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 87c70d18-3f76-7ba3-7c88-1c171e6acb08 reconnecting
Feb 22 11:20:59 node4 kernel: Lustre: 4815:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 092b01c8-de04-7b82-e833-f00921db6dce reconnecting
Feb 22 11:21:01 node4 kernel: Lustre: 4812:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 5b1bc354-6528-1343-0f2b-6a449c0cfe3e reconnecting
Feb 22 11:21:01 node4 kernel: Lustre: 4812:0:(ldlm_lib.c:497:target_handle_reconnect()) Skipped 3 previous similar messages
Feb 22 11:21:05 node4 kernel: Lustre: 4801:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: bf28f4b2-f9aa-5d83-a1a3-84964a8b525c reconnecting
Feb 22 11:21:10 node4 kernel: Lustre: 4787:0:(ldlm_lib.c:709:target_handle_connect()) hallmark-OST0004: refuse reconnection from 8b167c6e-719d-f424-deaf-ff06f26cccc5@192.168.0.106@tcp to 0xe9c77000/3
Feb 22 11:21:10 node4 kernel: LustreError: 4787:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error (-16) req@e8539a00 x78923/t0 o8->8b167c6e-719d-f424-deaf-ff06f26cccc5@NET_0x20000c0a8006a_UUID:-1 lens 304/200 ref 0 fl Interpret:/0/0 rc -16/0
Feb 22 11:21:20 node4 kernel: Lustre: 4731:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 56ad5d46-5237-688d-38d0-88655ff809bc reconnecting
Feb 22 11:21:20 node4 kernel: Lustre: 4731:0:(ldlm_lib.c:497:target_handle_reconnect()) Skipped 4 previous similar messages
Feb 22 11:21:53 node4 kernel: Lustre: hallmark-OST0004: haven't heard from client 4493c464-67d6-6825-7062-f932d392c1df (at 192.168.0.168@tcp) in 222 seconds. I think it's dead, and I am evicting it.
Feb 22 11:21:53 node4 kernel: Lustre: hallmark-OST0004: haven't heard from client 9762fef8-bb47-ca87-d2cd-7c439607c523 (at 192.168.0.158@tcp) in 212 seconds. I think it's dead, and I am evicting it.

Other side:

Feb 22 11:16:21 node3 kernel: Lustre: hallmark-OST0003: haven't heard from client 11e65f33-019b-c3cc-17d9-2ccf559a86cd (at 192.168.0.173@tcp) in 227 seconds. I think it's dead, and I am evicting it.
Feb 22 11:20:13 node3 kernel: LustreError: 16617:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 'hallmark-OST0004_UUID' is not available for connect (no target) req@f06ce600 x182177/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc 0/0
Feb 22 11:20:13 node3 kernel: LustreError: 16617:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error (-19) req@f06ce600 x182177/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 11:20:13 node3 kernel: LustreError: 16607:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 'hallmark-OST0004_UUID' is not available for connect (no target) req@e6b56000 x9091/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc 0/0
Feb 22 11:20:13 node3 kernel: LustreError: 16607:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error (-19) req@e6b56000 x9091/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 11:20:13 node3 kernel: LustreError: 16604:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 'hallmark-OST0004_UUID' is not available for connect (no target) req@ee1f2200 x40504/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc 0/0
Feb 22 11:20:13 node3 kernel: LustreError: 16604:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error (-19) req@ee1f2200 x40504/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 11:20:17 node3 kernel: LustreError: 16597:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 'hallmark-OST0004_UUID' is not available for connect (no target) req@ee1f2c00 x145702/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc 0/0
Feb 22 11:20:17 node3 kernel: LustreError: 16597:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error (-19) req@ee1f2c00 x145702/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 11:20:18 node3 kernel: LustreError: 16605:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 'hallmark-OST0004_UUID' is not available for connect (no target) req@d88bd800 x38335/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc 0/0
Feb 22 11:20:18 node3 kernel: LustreError: 16605:0:(ldlm_lib.c:576:target_handle_connect()) Skipped 1 previous similar message
Feb 22 11:20:18 node3 kernel: LustreError: 16605:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error (-19) req@d88bd800 x38335/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 11:20:18 node3 kernel: LustreError: 16605:0:(ldlm_lib.c:1363:target_send_reply_msg()) Skipped 1 previous similar message
Feb 22 11:20:20 node3 kernel: LustreError: 16612:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 'hallmark-OST0004_UUID' is not available for connect (no target) req@ee1f2200 x142768/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc 0/0
Feb 22 11:20:20 node3 kernel: LustreError: 16612:0:(ldlm_lib.c:576:target_handle_connect()) Skipped 4 previous similar messages
Feb 22 11:20:20 node3 kernel: LustreError: 16612:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error (-19) req@ee1f2200 x142768/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 11:20:20 node3 kernel: LustreError: 16612:0:(ldlm_lib.c:1363:target_send_reply_msg()) Skipped 4 previous similar messages
Feb 22 11:20:27 node3 kernel: LustreError: 16611:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 'hallmark-OST0004_UUID' is not available for connect (no target) req@c3c3f000 x7268/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc 0/0
Feb 22 11:20:27 node3 kernel: LustreError: 16611:0:(ldlm_lib.c:576:target_handle_connect()) Skipped 1 previous similar message
Feb 22 11:20:27 node3 kernel: LustreError: 16611:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error (-19) req@c3c3f000 x7268/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 11:20:27 node3 kernel: LustreError: 16611:0:(ldlm_lib.c:1363:target_send_reply_msg()) Skipped 1 previous similar message
Feb 22 11:20:32 node3 kernel: LustreError: 16608:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 'hallmark-OST0004_UUID' is not available for connect (no target) req@c3c3f000 x171/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc 0/0
Feb 22 11:20:32 node3 kernel: LustreError: 16608:0:(ldlm_lib.c:576:target_handle_connect()) Skipped 6 previous similar messages
Feb 22 11:20:32 node3 kernel: LustreError: 16608:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error (-19) req@c3c3f000 x171/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 11:20:32 node3 kernel: LustreError: 16608:0:(ldlm_lib.c:1363:target_send_reply_msg()) Skipped 6 previous similar messages
Feb 22 11:20:46 node3 kernel: LustreError: 16604:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 'hallmark-OST0004_UUID' is not available for connect (no target) req@ee1f2200 x18416/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc 0/0
Feb 22 11:20:46 node3 kernel: LustreError: 16604:0:(ldlm_lib.c:576:target_handle_connect()) Skipped 7 previous similar messages
Feb 22 11:20:46 node3 kernel: LustreError: 16604:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error (-19) req@ee1f2200 x18416/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 11:20:46 node3 kernel: LustreError: 16604:0:(ldlm_lib.c:1363:target_send_reply_msg()) Skipped 7 previous similar messages
Feb 22 11:21:08 node3 kernel: LustreError: 16599:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 'hallmark-OST0004_UUID' is not available for connect (no target) req@d2334800 x380465/t0 o8-><?>@<?>:-1 lens 240/0 ref 0 fl Interpret:/0/0 rc 0/0
Feb 22 11:21:08 node3 kernel: LustreError: 16599:0:(ldlm_lib.c:576:target_handle_connect()) Skipped 3 previous similar messages
Feb 22 11:21:08 node3 kernel: LustreError: 16599:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error (-19) req@d2334800 x380465/t0 o8-><?>@<?>:-1 lens 240/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 11:21:08 node3 kernel: LustreError: 16599:0:(ldlm_lib.c:1363:target_send_reply_msg()) Skipped 3 previous similar messages
Feb 22 11:22:23 node3 kernel: LustreError: 16227:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 'hallmark-OST0004_UUID' is not available for connect (no target) req@f7c9bc2c x11418/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc 0/0
Feb 22 11:22:23 node3 kernel: LustreError: 16227:0:(ldlm_lib.c:576:target_handle_connect()) Skipped 1 previous similar message
Feb 22 11:22:23 node3 kernel: LustreError: 16227:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error (-19) req@f7c9bc2c x11418/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 11:22:23 node3 kernel: LustreError: 16227:0:(ldlm_lib.c:1363:target_send_reply_msg()) Skipped 1 previous similar message
Feb 22 11:23:49 node3 kernel: Lustre: hallmark-OST0003: haven't heard from client 9762fef8-bb47-ca87-d2cd-7c439607c523 (at 192.168.0.158@tcp) in 227 seconds. I think it's dead, and I am evicting it.

The cluster is online now, but what is going on? What is the ROUTER_NOTIFY message, and why were the LNET connections lost? I just can't figure out what is happening.

Thank you very much,
tamas