Displaying 20 results from an estimated 27 matches for "o2net_idle_tim".

2009 Mar 18
2
shutdown by o2net_idle_timer causes Xen to hang
...og/messages of the node that issued the shutdown (see below) and the nearly five hour gap in the logs of the other node. Mar 15 14:39:47 ugc-1 kernel: o2net: connection to node cod-2 (num 3) at 10.0.0.42:7777 has been idle for 30.0 seconds, shutting it down. Mar 15 14:39:47 ugc-1 kernel: (0,0):o2net_idle_timer:1476 here are some times that might help debug the situation: (tmr 1237124357.624587 now 1237124387.624394 dr 1237124357.624578 adv 1237124357.624588:1237124357.624589 func (be795f6d:507) 1237124191.594238:1237124191.594242) Mar 15 14:39:47 ugc-1 kernel: o2net: no longer connected to node...
2009 Jul 22
2
OCFS2 Node restart
...odes are accessing common share on OCFS. --------------------------------------------------------- -Jul 22 09:01:25 172.25.29.10 kernel: o2net: connection to node alf3 (num 3) at 172.25.29.13:7777 has been idle for 30.0 seconds, shutting it down. -Jul 22 09:01:25 172.25.29.10 kernel: (0,1):o2net_idle_timer:1506 here are some times that might help debug the situation: (tmr 1248267655.660420 now 1248267685.655778 dr 1248267655.660405 adv 1248267655.660422:1248267655.660423 func (0ffa2aed:505) 1248267647.662032:1248267647.662034) -Jul 22 09:01:25 172.25.29.10 kernel: o2net: no longer connected t...
2009 Nov 06
0
iscsi connection drop, comes back in seconds, then deadlock in cluster
...5BAB5F5AA1C93D7E0: waiting 5000ms for notification of death of node 8 Logs at time of failure on Node 7 (rack105): Nov 6 01:00:38 rack105 kernel: o2net: connection to node mgr01 (num 0) at 10.244.1.100:7777 has been idle for 30.0 seconds, shutting it down. Nov 6 01:00:38 rack105 kernel: (0,1):o2net_idle_timer:1503 here are some times that might help debug the situation: (tmr 1257498008.773099 now 1257498038.772814 dr 1257498008.773068 adv 1257498008.773099:1257498008.773100 func (6055cf71:502) 1257498002.348395:1257498002.348398) Nov 6 01:00:38 rack105 kernel: o2net: no longer connected to node m...
2008 Jul 14
1
Node fence on RHEL4 machine running 1.2.8-2
...ote_convert_request' and 'dlm_wait_for_node_death'). At 5:35:53 this morning, all local disk logging activity on node1 stopped. Our Nagios server could continue to ping node1 during this time. Meanwhile, in the remote syslogs, this started happening: Jul 14 05:36:22 node1 (27575,0):o2net_idle_timer:1426 here are some times that might help debug the situation: (tmr 1215977752.923279 now 1215977782.914919 dr 1215977754.923101 adv 1215977752.923285:1215977752.923288 func (88eefa53:504) 1215977375.25138:1215977375.25147) Jul 14 05:36:23 node1 (27575,0):o2net_idle_timer:1426 here are some times...
2006 Apr 18
1
Self-fencing issues (RHEL4)
...107 Apr 18 15:56:46 rac1/rac1 (19545,0):ocfs2_replay_journal:1172 Recovering node 1 from slot 1 on device (8,41) Apr 18 15:56:46 rac1/rac1 (19544,0):ocfs2_replay_journal:1172 Recovering node 1 from slot 1 on device (8,37) Apr 18 15:56:51 rac2/rac2 <0>Rebooting in 60 seconds..<5>(3,0):o2net_idle_timer:1310 connection to node rac1 (num 0) at 10.0.1.1:7777 has been idle for 10 seconds, shutting it down. Apr 18 15:56:51 rac2/rac2 (3,0):o2net_idle_timer:1321 here are some times that might help debug the situation: (tmr 1145401001.986417 now 1145401011.984614 dr 1145401005.947636 adv 1145401001...
2009 Jun 09
6
question about oracle shared home install
Hi all. Scenario: I'm trying to install 9i RAC on a two-node cluster on OCFS2. OS: Oracle Enterprise Linux. As I understand it, OCFS2 supports shared-home installs, meaning I can keep not only datafiles and control files on it but also cluster manager files and binaries (pretty much everything: no files or executables need to be kept local to any node). I have one single shared file for
2007 Feb 06
2
Network 10 sec timeout setting?
Hello! Didn't a setting for the 10-second network timeout get into the 2.6.20 kernel? If so, how do we set it? I am getting OCFS2 1.3.3 (2201,0):o2net_connect_expired:1547 ERROR: no connection established with node 1 after 10.0 seconds, giving up and returning errors. (2458,0):dlm_request_join:802 ERROR: status = -107 (2458,0):dlm_try_to_join_domain:950 ERROR: status = -107
2009 Apr 20
2
BUG: soft lockup - CPU#1 stuck for 61s
...6 um-be-2 [145813.022397] o2net: no longer connected to node um-fe-1 (num 1) at 192.168.10.10:7777 Apr 20 17:31:16 um-fe-1 [ 9087.529912] o2net: connection to node um-be-1 (num 3) at 192.168.10.20:7777 has been idle for 30.0 seconds, shutting it down. Apr 20 17:31:16 um-fe-1 [ 9087.529971] (4614,1):o2net_idle_timer:1468 here are some times that might help debug the situation: (tmr 1240219828.837488 now 1240219858.837654 dr 1240219858.834946 adv 1240219828.837494:1240219828.837496 func (d5a868ed:502) 1240219802.621728:1240219802.621733) Apr 20 17:31:16 um-fe-1 [ 9087.529971] o2net: connection to node um-be-2...
2023 Jun 27
0
[PATCH] fs: ocfs: fix potential deadlock on &qs->qs_lock
As &qs->qs_lock is also acquired by the timer o2net_idle_timer(), which executes under softirq context, code executing under process context should disable irq before acquiring the lock; otherwise a deadlock could happen if the process context holds the lock and is then preempted by the timer. Possible deadlock scenario: o2quo_make_decision (workqueue) -> spin_lo...
2009 Jul 29
3
Error message whil booting system
...7BE7E9E2026A40F8801B56257D805C88"): 0 1 2 3 4 5 Kernel log from another node alf1 for above node alf3 is like Jul 29 10:15:57 alf1 kernel: o2net: connection to node alf3 (num 3) at 172.25.29.13:7777 has been idle for 30.0 seconds, shutting it down. Jul 29 10:15:57 alf1 kernel: (0,1):o2net_idle_timer:1506 here are some times that might help debug the situation: (tmr 1248876927.861591 now 1248876957.858464 dr 1248876927.861556 adv 1248876927.861622:1248876927.861623 func (0ffa2aed:506) 1248876927.861592:1248876927.861604) Jul 29 10:15:57 alf1 kernel: o2net: no longer connected to node alf...
2010 Oct 23
1
Reg: ocfs2 two node cluster crashed, node2 crashed, when I rebooted node1 for maintenance.
...539603 Oct 23 15:42:58 node2 kernel: ocfs2_dlm: Nodes in domain ("D96AC8E8BDD54913AE6D8EC0EB539603"): 2 Oct 23 15:44:06 node2 kernel: o2net: connection to node node1 (num 1) at 192.168.3.1:7777 has been idle for 60.0 seconds, shutting it down. Oct 23 15:44:06 node2 kernel: (swapper,0,15):o2net_idle_timer:1503 here are some times that might help debug the situation: (tmr 1287848586.872368 now 1287848646.872227 dr 1287848586.872346 adv 1287848586.872376:1287848586.872376 func (fb860756:513) 1287848578.874476:1287848578.874487) Oct 23 15:44:06 node2 kernel: o2net: no longer connected to node node1...
2008 Feb 04
0
[PATCH] o2net: Reconnect after idle time out.
...8 @@ static void o2net_set_nn_state(struct o2 delay = 0; mlog(ML_CONN, "queueing conn attempt in %lu jiffies\n", delay); queue_delayed_work(o2net_wq, &nn->nn_connect_work, delay); + queue_delayed_work(o2net_wq, &nn->nn_connect_expired, + delay + msecs_to_jiffies(o2net_idle_timeout(sc->sc_node))); } /* keep track of the nn's sc ref for the caller */ @@ -1193,6 +1188,7 @@ static int o2net_check_handshake(struct * shut down already */ if (nn->nn_sc == sc) { o2net_sc_reset_idle_timer(sc); + nn->nn_timeout = 0; o2net_set_nn_state(nn, sc, 1, 0);...
2010 Jan 14
1
another fencing question
Hi, periodically one node of my two-node cluster is fenced; here are the logs: Jan 14 07:01:44 nvr1-rc kernel: o2net: no longer connected to node nvr2-rc.minint.it (num 0) at 1.1.1.6:7777 Jan 14 07:01:44 nvr1-rc kernel: (21534,1):dlm_do_master_request:1334 ERROR: link to 0 went down! Jan 14 07:01:44 nvr1-rc kernel: (4007,4):dlm_send_proxy_ast_msg:458 ERROR: status = -112 Jan 14 07:01:44
2010 Dec 09
2
servers blocked on ocfs2
...he lines in their messages files: =====node heraclito (0)======================================== /Dec 4 09:15:06 heraclito kernel: o2net: connection to node parmenides (num 1) at 192.168.1.2:7777 has been idle for 30.0 seconds, shutting it down. Dec 4 09:15:06 heraclito kernel: (swapper,0,7):o2net_idle_timer:1503 here are some times that might help debug the situation: (tmr 1291450476.228826 now 1291450506.229456 dr 1291450476.228760 adv 1291450476.228842:1291450476.228843 func (de6e01eb:500) 1291450476.228827:1291450476.228829) Dec 4 09:15:06 heraclito kernel: o2net: no longer connected to node...
2007 Feb 26
1
dlm timeouts and following errors -112
...26 21:03:47 ppsbackup101 heartbeat: [5394]: ERROR: Irretrievably lost packet: node ppsdb102 seq 6 Feb 26 21:04:32 ppsbackup101 kernel: o2net: connection to node ppsnfs102 (num 3) at 192.168.102.32:7777 has been idle for 300.0 seconds, shutting it down. Feb 26 21:04:32 ppsbackup101 kernel: (5394,1):o2net_idle_timer:1426 here are some times that might help debug the situation: (tmr 1172519972.626184 now 1172520272.653263 dr 1172519972.626167 adv 1172519972.626208:1172519972.626210 func (666c6172:510) 1172519972.626186:1172519972.626195) Feb 26 21:04:32 ppsbackup101 kernel: o2net: no longer connected to node...
2009 Feb 04
1
Strange dmesg messages
...rebooted and the another we restarted the nfsd and it brought him back. Looking at node #0 - the one that rebooted - the logs all seem normal, but looking at the other node's dmesg we saw these messages: First the o2net detected that node #0 was dead: (It seems everything OK here) (0,0):o2net_idle_timer:1422 here are some times that might help debug the situation: (tmr 1233748167.271522 now 1233748227.272666 dr 1233748167.271516 adv 1233748167.271532:1233748167.271533 func (300d6acb:500) 1233748167.271522:1233748167.271526) o2net: no longer connected to node soap02 (num 0) at 192.168.0.10:777...
2008 Feb 13
2
[PATCH] o2net: Reconnect after idle time out.V2
...that run + * through here but we only cancel the connect_expired work when + * a connection attempt succeeds. So only the first enqueue of + * the connect_expired work will do anything. The rest will see + * that it's already queued and do nothing. + */ + delay += msecs_to_jiffies(o2net_idle_timeout(sc->sc_node)); + queue_delayed_work(o2net_wq, &nn->nn_connect_expired, delay); } /* keep track of the nn's sc ref for the caller */ @@ -1193,6 +1198,7 @@ static int o2net_check_handshake(struct * shut down already */ if (nn->nn_sc == sc) { o2net_sc_reset_idle_ti...
2014 Sep 26
2
One node hangs up issue requiring goog idea, thanks
...will still waiting for the dlm. CAS2/logdir/var/log/syslog.1-6778-Sep 16 20:57:16 CAS2 kernel: [516366.623623] o2net: Connection to node CAS1 (num 1) at 10.172.254.1:7100 has been idle for 30.87 secs, shutting it down. CAS2/logdir/var/log/syslog.1-6779-Sep 16 20:57:16 CAS2 kernel: [516366.623631] o2net_idle_timer 1621: Local and remote node is heartbeating, and try connect CAS2/logdir/var/log/syslog.1-6780-Sep 16 20:57:16 CAS2 kernel: [516366.623792] o2net: No longer connected to node CAS1 (num 1) at 10.172.254.1:7100 CAS2/logdir/var/log/syslog.1:6781:Sep 16 20:57:16 CAS2 kernel: [516366.623881] (dlm_thre...
2007 Nov 29
1
Troubles with two node
...name = web-ha2 cluster = ocfs2 cluster: node_count = 2 name = ocfs2 web-ha1:~ # Nov 28 15:28:59 web-ha2 kernel: o2net: connection to node web-ha1 (num 0) at 192.168.255.1:7777 has been idle for 10 seconds, shutting it down. Nov 28 15:28:59 web-ha2 kernel: (23432,0):o2net_idle_timer:1297 here are some times that might help debug the situation: (tmr 1196260129.36511 now 1196260139.34907 dr 1196260129.36503 adv 1196260129.36514:1196260129.36515 func (95bc84eb:504) 1196260129.36329:1196260129.36337) Nov 28 15:28:59 web-ha2 kernel: o2net: no longer connected to node web-ha1 (nu...
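
Note on reading the excerpts above: in every o2net_idle_timer debug line quoted in these results, the gap between the "tmr" and "now" fields roughly matches the idle time reported in the preceding "has been idle for ..." message. The following is a minimal Python sketch for pulling that gap out of a log line. It is not part of OCFS2 or any of the quoted threads; the function name idle_seconds is hypothetical, and the code assumes the part after the dot is a microsecond count (some excerpts print it without zero padding, e.g. "1215977375.25138"), which is an inference from the samples rather than something the threads state.

```python
import re

# Captures seconds and the fractional field of "tmr" and "now".
# Assumption: the fractional field is a microsecond count, possibly unpadded.
_TMR_NOW = re.compile(r"\(tmr (\d+)\.(\d+) now (\d+)\.(\d+)")


def idle_seconds(line):
    """Return (now - tmr) in seconds, or None if the fields are absent."""
    match = _TMR_NOW.search(line)
    if match is None:
        return None
    tmr_s, tmr_us, now_s, now_us = (int(g) for g in match.groups())
    # Work in integer microseconds so unpadded fields are handled correctly.
    delta_us = (now_s - tmr_s) * 1_000_000 + (now_us - tmr_us)
    return delta_us / 1_000_000


if __name__ == "__main__":
    # Sample taken from the first result in this listing.
    sample = (
        "Mar 15 14:39:47 ugc-1 kernel: (0,0):o2net_idle_timer:1476 here are "
        "some times that might help debug the situation: "
        "(tmr 1237124357.624587 now 1237124387.624394 dr 1237124357.624578"
    )
    print(idle_seconds(sample))
```

Lines without the "(tmr ... now ..." pair, such as the 2014 excerpt with the newer message format, yield None rather than raising.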