Displaying 20 results from an estimated 27 matches for "o2net_idle_timer".
2009 Mar 18
2
shutdown by o2net_idle_timer causes Xen to hang
...og/messages of the node that issued the shutdown (see below) and
the nearly five hour gap in the logs of the other node.
Mar 15 14:39:47 ugc-1 kernel: o2net: connection to node cod-2 (num 3)
at 10.0.0.42:7777 has been idle for 30.0 seconds, shutting it down.
Mar 15 14:39:47 ugc-1 kernel: (0,0):o2net_idle_timer:1476 here are
some times that might help debug the situation: (tmr 1237124357.624587
now 1237124387.624394 dr 1237124357.624578 adv
1237124357.624588:1237124357.624589 func (be795f6d:507)
1237124191.594238:1237124191.594242)
Mar 15 14:39:47 ugc-1 kernel: o2net: no longer connected to node c...
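In the excerpt above, the gap between the `now` and `tmr` fields of the o2net_idle_timer line appears to match the reported idle time (about 30 seconds). A minimal Python sketch for pulling that gap out of a wrapped log excerpt follows. The helper name `idle_seconds` is hypothetical, and the sketch assumes each field is printed as seconds, a dot, then microseconds, possibly without zero padding. It does not interpret the `dr`, `adv`, or `func` fields.

```python
import re

# Only the "tmr" and "now" fields are parsed. Each is assumed to be
# <seconds>.<microseconds>, with the microseconds possibly printed without
# zero padding (an assumption about the kernel's format string).
TIMES_RE = re.compile(r"\(tmr\s+(\d+)\.(\d+)\s+now\s+(\d+)\.(\d+)")


def idle_seconds(log_text):
    """Return (now - tmr) in seconds, or None if the fields are missing."""
    # Syslog excerpts are often wrapped, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    match = TIMES_RE.search(flat)
    if match is None:
        return None
    tmr_s, tmr_us, now_s, now_us = (int(g) for g in match.groups())
    return (now_s - tmr_s) + (now_us - tmr_us) / 1_000_000


sample = """
Mar 15 14:39:47 ugc-1 kernel: (0,0):o2net_idle_timer:1476 here are
some times that might help debug the situation: (tmr 1237124357.624587
now 1237124387.624394 dr 1237124357.624578 adv
"""
print(idle_seconds(sample))  # close to the 30.0 seconds reported by o2net
```

Whitespace collapsing only repairs wraps that fall between fields. An excerpt wrapped in the middle of a number needs to be rejoined by hand first.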
2009 Jul 22
2
OCFS2 Node restart
...odes are accessing common share on
OCFS.
---------------------------------------------------------
-Jul 22 09:01:25 172.25.29.10 kernel: o2net: connection to node alf3 (num 3)
at 172.25.29.13:7777 has been idle for 30.0 seconds, shutting it down.
-Jul 22 09:01:25 172.25.29.10 kernel: (0,1):o2net_idle_timer:1506 here are
some times that might help debug the situation: (tmr 1248267655.660420
now 1248267685.655778 dr 1248267655.660405 adv
1248267655.660422:1248267655.660423 func (0ffa2aed:505)
1248267647.662032:1248267647.662034)
-Jul 22 09:01:25 172.25.29.10 kernel: o2net: no longer connected to...
2009 Nov 06
0
iscsi connection drop, comes back in seconds, then deadlock in cluster
...5BAB5F5AA1C93D7E0: waiting 5000ms for notification of
death of node 8
Logs at time of failure on Node 7 (rack105):
Nov 6 01:00:38 rack105 kernel: o2net: connection to node mgr01 (num 0)
at 10.244.1.100:7777 has been idle for 30.0 seconds, shutting it down.
Nov 6 01:00:38 rack105 kernel: (0,1):o2net_idle_timer:1503 here are
some times that might help debug the situation: (tmr 1257498008.773099
now 1257498038.772814 dr 1257498008.773068 adv
1257498008.773099:1257498008.773100 func (6055cf71:502)
1257498002.348395:1257498002.348398)
Nov 6 01:00:38 rack105 kernel: o2net: no longer connected to node mgr...
2008 Jul 14
1
Node fence on RHEL4 machine running 1.2.8-2
...ote_convert_request' and 'dlm_wait_for_node_death').
At 5:35:53 this morning, all local disk logging activity on node1
stopped. Our Nagios server could continue to ping node1 during this
time. Meanwhile, in the remote syslogs, this started happening:
Jul 14 05:36:22 node1 (27575,0):o2net_idle_timer:1426 here are some
times that might help debug the situation: (tmr 1215977752.923279 now
1215977782.914919 dr 1215977754.923101 adv
1215977752.923285:1215977752.923288 func (88eefa53:504)
1215977375.25138:1215977375.25147)
Jul 14 05:36:23 node1 (27575,0):o2net_idle_timer:1426 here are some
times t...
2006 Apr 18
1
Self-fencing issues (RHEL4)
...107
Apr 18 15:56:46 rac1/rac1 (19545,0):ocfs2_replay_journal:1172 Recovering
node 1 from slot 1 on device (8,41)
Apr 18 15:56:46 rac1/rac1 (19544,0):ocfs2_replay_journal:1172 Recovering
node 1 from slot 1 on device (8,37)
Apr 18 15:56:51 rac2/rac2 <0>Rebooting in 60
seconds..<5>(3,0):o2net_idle_timer:1310 connection to node rac1 (num 0)
at 10.0.1.1:7777 has been idle for 10 seconds, shutting it down.
Apr 18 15:56:51 rac2/rac2 (3,0):o2net_idle_timer:1321 here are some
times that might help debug the situation: (tmr 1145401001.986417 now
1145401011.984614 dr 1145401005.947636 adv
1145401001.9...
2009 Jun 09
6
question about oracle shared home install
Hi All,
Scenario: I'm trying to install 9i rac on a 2 node cluster on OCFS2
OS: Oracle enterprise linux
To my understanding, OCFS2 supports shared home installs, which as far as
I know means I can keep not only datafiles and control files there but also
cluster manager files and binaries (pretty much everything: no files or
executables need to be kept local to any node). I have one single shared
file for
2007 Feb 06
2
Network 10 sec timeout setting?
Hello!
Didn't a setting for the 10-second network timeout get into the
2.6.20 kernel?
If so, how do we set it?
I am getting
OCFS2 1.3.3
(2201,0):o2net_connect_expired:1547 ERROR: no connection established
with node 1 after 10.0 seconds, giving up and returning errors.
(2458,0):dlm_request_join:802 ERROR: status = -107
(2458,0):dlm_try_to_join_domain:950 ERROR: status = -107
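The `status = -107` values above follow the usual kernel convention of returning a negated errno. On Linux, 107 is ENOTCONN, and the 112 seen in other threads in these results is EHOSTDOWN. A small sketch for decoding such values with the Python standard library is below. The helper name `decode_status` is hypothetical, and the numeric mapping is platform-specific, so the output is only meaningful when run on Linux.

```python
import errno
import os


def decode_status(status):
    """Map a negative kernel-style status to (errno name, description)."""
    code = abs(status)
    # errno.errorcode maps numbers to symbolic names for this platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


for status in (-107, -112):
    print(status, *decode_status(status))
```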
2009 Apr 20
2
BUG: soft lockup - CPU#1 stuck for 61s
...6 um-be-2 [145813.022397] o2net: no longer connected to
node um-fe-1 (num 1) at 192.168.10.10:7777
Apr 20 17:31:16 um-fe-1 [ 9087.529912] o2net: connection to node
um-be-1 (num 3) at 192.168.10.20:7777 has been idle for 30.0 seconds,
shutting it down.
Apr 20 17:31:16 um-fe-1 [ 9087.529971] (4614,1):o2net_idle_timer:1468
here are some times that might help debug the situation: (tmr
1240219828.837488 now 1240219858.837654 dr 1240219858.834946 adv
1240219828.837494:1240219828.837496 func (d5a868ed:502)
1240219802.621728:1240219802.621733)
Apr 20 17:31:16 um-fe-1 [ 9087.529971] o2net: connection to node
um-be-2 (...
2023 Jun 27
0
[PATCH] fs: ocfs: fix potential deadlock on &qs->qs_lock
As &qs->qs_lock is also acquired by the timer o2net_idle_timer(),
which executes in softirq context, code executing in process
context should disable irq before acquiring the lock; otherwise a
deadlock could happen if the process context holds the lock and is
then preempted by the timer.
Possible deadlock scenario:
o2quo_make_decision (workqueue)
-> spin_lock...
2009 Jul 29
3
Error message whil booting system
...7BE7E9E2026A40F8801B56257D805C88"): 0 1 2 3 4 5
Kernel log from another node alf1 for above node alf3 is like
Jul 29 10:15:57 alf1 kernel: o2net: connection to node alf3 (num 3) at
172.25.29.13:7777 has been idle for 30.0 seconds, shutting it down.
Jul 29 10:15:57 alf1 kernel: (0,1):o2net_idle_timer:1506 here are some times
that might help debug the situation: (tmr 1248876927.861591
now 1248876957.858464 dr 1248876927.861556 adv
1248876927.861622:1248876927.861623 func (0ffa2aed:506)
1248876927.861592:1248876927.861604)
Jul 29 10:15:57 alf1 kernel: o2net: no longer connected to node alf3...
2010 Oct 23
1
Reg: ocfs2 two node cluster crashed, node2 crashed, when I rebooted node1 for maintenance.
...539603
Oct 23 15:42:58 node2 kernel: ocfs2_dlm: Nodes in domain
("D96AC8E8BDD54913AE6D8EC0EB539603"): 2
Oct 23 15:44:06 node2 kernel: o2net: connection to node node1 (num 1) at
192.168.3.1:7777 has been idle for 60.0 seconds, shutting it down.
Oct 23 15:44:06 node2 kernel: (swapper,0,15):o2net_idle_timer:1503 here are
some times that might help debug the situation: (tmr 1287848586.872368
now 1287848646.872227 dr 1287848586.872346 adv
1287848586.872376:1287848586.872376 func (fb860756:513)
1287848578.874476:1287848578.874487)
Oct 23 15:44:06 node2 kernel: o2net: no longer connected to node node1 (...
2008 Feb 04
0
[PATCH] o2net: Reconnect after idle time out.
...nt o2net_check_handshake(struct
* shut down already */
if (nn->nn_sc == sc) {
o2net_sc_reset_idle_timer(sc);
+ nn->nn_timeout = 0;
o2net_set_nn_state(nn, sc, 1, 0);
}
spin_unlock(&nn->nn_lock);
@@ -1391,6 +1387,7 @@ static void o2net_sc_send_keep_req(struc
static void o2net_idle_timer(unsigned long data)
{
struct o2net_sock_container *sc = (struct o2net_sock_container *)data;
+ struct o2net_node *nn = o2net_nn_from_num(sc->sc_node->nd_num);
struct timeval now;
do_gettimeofday(&now);
@@ -1413,6 +1410,14 @@ static void o2net_idle_timer(unsigned lo
sc->...
2010 Jan 14
1
another fencing question
Hi,
Periodically one node of my two-node cluster is fenced. Here are the logs:
Jan 14 07:01:44 nvr1-rc kernel: o2net: no longer connected to node
nvr2-rc.minint.it (num 0) at 1.1.1.6:7777
Jan 14 07:01:44 nvr1-rc kernel: (21534,1):dlm_do_master_request:1334 ERROR:
link to 0 went down!
Jan 14 07:01:44 nvr1-rc kernel: (4007,4):dlm_send_proxy_ast_msg:458 ERROR:
status = -112
Jan 14 07:01:44
2010 Dec 09
2
servers blocked on ocfs2
...he
lines in their messages files:
=====node heraclito (0)========================================
Dec 4 09:15:06 heraclito kernel: o2net: connection to node parmenides
(num 1) at 192.168.1.2:7777 has been idle for 30.0 seconds, shutting it
down.
Dec 4 09:15:06 heraclito kernel: (swapper,0,7):o2net_idle_timer:1503
here are some times that might help debug the situation: (tmr
1291450476.228826
now 1291450506.229456 dr 1291450476.228760 adv
1291450476.228842:1291450476.228843 func (de6e01eb:500)
1291450476.228827:1291450476.228829)
Dec 4 09:15:06 heraclito kernel: o2net: no longer connected to node...
2007 Feb 26
1
dlm timeouts and following errors -112
...26 21:03:47 ppsbackup101 heartbeat: [5394]: ERROR: Irretrievably lost
packet: node ppsdb102 seq 6
Feb 26 21:04:32 ppsbackup101 kernel: o2net: connection to node ppsnfs102 (num 3)
at 192.168.102.32:7777 has been idle for 300.0 seconds, shutting it down.
Feb 26 21:04:32 ppsbackup101 kernel: (5394,1):o2net_idle_timer:1426 here are
some times that might help debug the situation: (tmr 1172519972.626184 now
1172520272.653263 dr 1172519972.626167 adv 1172519972.626208:1172519972.626210
func (666c6172:510) 1172519972.626186:1172519972.626195)
Feb 26 21:04:32 ppsbackup101 kernel: o2net: no longer connected to node
pp...
2009 Feb 04
1
Strange dmesg messages
...rebooted and the
another we restarted the nfsd and it brought him back.
Looking at the logs of node #0 (the one that rebooted), everything seems
normal, but in the other node's dmesg we saw these messages:
First, o2net detected that node #0 was dead (everything seems OK here):
(0,0):o2net_idle_timer:1422 here are some times that might help debug
the situation: (tmr 1233748167.271522 now 1233748227.272666 dr
1233748167.271516 adv 1233748167.271532:1233748167.271533 func
(300d6acb:500) 1233748167.271522:1233748167.271526)
o2net: no longer connected to node soap02 (num 0) at 192.168.0.10:7777...
2008 Feb 13
2
[PATCH] o2net: Reconnect after idle time out.V2
...andshake(struct
* shut down already */
if (nn->nn_sc == sc) {
o2net_sc_reset_idle_timer(sc);
+ atomic_set(&nn->nn_timeout, 0);
o2net_set_nn_state(nn, sc, 1, 0);
}
spin_unlock(&nn->nn_lock);
@@ -1391,6 +1397,7 @@ static void o2net_sc_send_keep_req(struc
static void o2net_idle_timer(unsigned long data)
{
struct o2net_sock_container *sc = (struct o2net_sock_container *)data;
+ struct o2net_node *nn = o2net_nn_from_num(sc->sc_node->nd_num);
struct timeval now;
do_gettimeofday(&now);
@@ -1413,6 +1420,12 @@ static void o2net_idle_timer(unsigned lo
sc->...
2014 Sep 26
2
One node hangs up issue requiring goog idea, thanks
...will still waiting for the dlm.
CAS2/logdir/var/log/syslog.1-6778-Sep 16 20:57:16 CAS2 kernel: [516366.623623] o2net: Connection to node CAS1 (num 1) at 10.172.254.1:7100 has been idle for 30.87 secs, shutting it down.
CAS2/logdir/var/log/syslog.1-6779-Sep 16 20:57:16 CAS2 kernel: [516366.623631] o2net_idle_timer 1621: Local and remote node is heartbeating, and try connect
CAS2/logdir/var/log/syslog.1-6780-Sep 16 20:57:16 CAS2 kernel: [516366.623792] o2net: No longer connected to node CAS1 (num 1) at 10.172.254.1:7100
CAS2/logdir/var/log/syslog.1:6781:Sep 16 20:57:16 CAS2 kernel: [516366.623881] (dlm_thread...
2007 Nov 29
1
Troubles with two node
...name = web-ha2
cluster = ocfs2
cluster:
node_count = 2
name = ocfs2
web-ha1:~ #
Nov 28 15:28:59 web-ha2 kernel: o2net: connection to node web-ha1 (num
0) at 192.168.255.1:7777 has been idle for 10 seconds, shutting it down.
Nov 28 15:28:59 web-ha2 kernel: (23432,0):o2net_idle_timer:1297 here are
some times that might help debug the situation: (tmr 1196260129.36511
now 1196260139.34907 dr 1196260129.36503 adv
1196260129.36514:1196260129.36515 func
(95bc84eb:504) 1196260129.36329:1196260129.36337)
Nov 28 15:28:59 web-ha2 kernel: o2net: no longer connected to node
web-ha1 (num...