Displaying 6 results from an estimated 6 matches for "dlm_wait_for_node_death".
2007 Mar 08
4
ocfs2 cluster becomes unresponsive
...se-1-mht kernel: (147,0):dlm_send_remote_unlock_request:356 ERROR: status = -107
Mar 8 07:20:50 groupwise-1-mht last message repeated 255 times
Mar 8 07:20:53 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:20:53 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:20:58 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:20:58 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE25...
2008 Jul 14
1
Node fence on RHEL4 machine running 1.2.8-2
...n the OCFS2 Issues
register page, but there's not a lot of detail in that issue. Anyway...
Please note that node4 didn't have remote syslogging set up when this
happened, but it was logging the same things as nodes 2 and 3 were
logging ('dlm_send_remote_convert_request' and 'dlm_wait_for_node_death').
At 5:35:53 this morning, all local disk logging activity on node1
stopped. Our Nagios server could continue to ping node1 during this
time. Meanwhile, in the remote syslogs, this started happening:
Jul 14 05:36:22 node1 (27575,0):o2net_idle_timer:1426 here are some
times that might help...
2006 May 18
0
Node crashed after remove a path
...386 GNU/Linux
rpm -qa | grep ocfs
ocfs2console-1.2.0-1
ocfs2-tools-1.2.0-1
ocfs2-2.6.9-22.ELsmp-1.2.0-1
rpm -qa | grep -i device
device-mapper-1.01.04-1.0.RHEL4
device-mapper-multipath-0.4.5-6.0.RHEL4
Console messages:
(5104,1):dlm_send_remote_convert_request:393 ERROR: status = -107
(5104,1):dlm_wait_for_node_death:285 4E4133205E3C4AD980D6BBBE4AE4014B: waiting
5000ms for notification of death of node 0
(6360,0):dlm_send_remote_convert_request:393 ERROR: status = -107
(6360,0):dlm_wait_for_node_death:285 EDB955CBD81B44C78CD9258B99F91E4C: waiting
5000ms for notification of death of node 0
(5104,1):dlm_send_re...
2008 Jan 23
1
OCFS2 DLM problems
...dbprd02 kernel: (5096,0):o2net_sendpage:868 ERROR:
sendpage of size 24 to node dbprd01 (num 0) at 192.168.202.1:7777 failed
with -11
Jan 23 03:20:44 dbprd02 kernel: o2net: no longer connected to node
dbprd01 (num 0) at 192.168.202.1:7777
After these there are plenty more messages, such as
"dlm_wait_for_node_death", "dlm_send_remote_convert_request" on dbprd02
and "dlm_send_proxy_ast_msg", "dlm_flush_asts" on dbprd01.
We are currently running OCFS2 1.2.5, the kernel is EL4 Update 5 x86_64
(2.6.9-55.ELsmp).
I see there is one bug fixed in 1.2.6/1.2.7 related to DLM and I w...
2010 Dec 09
2
servers blocked on ocfs2
...s (num 1) at 192.168.1.2:7777
Dec 4 09:15:06 heraclito kernel:
(vzlist,22622,7):dlm_send_remote_convert_request:395 ERROR: status = -112
Dec 4 09:15:06 heraclito kernel:
(snmpd,16452,10):dlm_send_remote_convert_request:395 ERROR: status = -112
Dec 4 09:15:06 heraclito kernel:
(snmpd,16452,10):dlm_wait_for_node_death:370
0D3E49EB1F614A3EAEC0E2A74A34AFFF: waiting 5000ms for notification of
death of node 1
Dec 4 09:15:06 heraclito kernel:
(httpd,4615,10):dlm_do_master_request:1334 ERROR: link to 1 went down!
Dec 4 09:15:06 heraclito kernel:
(httpd,4615,10):dlm_get_lock_resource:917 ERROR: status = -112
Dec...
2009 Nov 06
0
iscsi connection drop, comes back in seconds, then deadlock in cluster
...d: connection1:0 is operational after
recovery (1 attempts)
Nov 6 01:00:38 mgr01 kernel: o2net: no longer connected to node rack105
(num 7) at 10.244.1.105:7777
Nov 6 01:00:38 mgr01 kernel:
(3270,0):dlm_send_remote_convert_request:395 ERROR: status = -112
Nov 6 01:00:38 mgr01 kernel: (3270,0):dlm_wait_for_node_death:370
4FF4E858AF6E4AEEB2650A543A320C2F: waiting 5000ms for notification of
death of node 7
Nov 6 01:00:38 mgr01 kernel: o2net: accepted connection from node
rack105 (num 7) at 10.244.1.105:7777
Nov 6 01:00:46 mgr01 kernel: connection1:0: ping timeout of 5 secs
expired, recv timeout 5, last rx...