Displaying 20 results from an estimated 58 matches for "o2net".
2009 Nov 06
0
iscsi connection drop, comes back in seconds, then deadlock in cluster
..., now
4325774337
Nov 6 01:00:12 mgr01 kernel: connection1:0: detected conn error (1011)
Nov 6 01:00:13 mgr01 iscsid: Kernel reported iSCSI connection 1:0 error
(1011) state (3)
Nov 6 01:00:15 mgr01 iscsid: connection1:0 is operational after
recovery (1 attempts)
Nov 6 01:00:38 mgr01 kernel: o2net: no longer connected to node rack105
(num 7) at 10.244.1.105:7777
Nov 6 01:00:38 mgr01 kernel:
(3270,0):dlm_send_remote_convert_request:395 ERROR: status = -112
Nov 6 01:00:38 mgr01 kernel: (3270,0):dlm_wait_for_node_death:370
4FF4E858AF6E4AEEB2650A543A320C2F: waiting 5000ms for notification o...
2009 Jul 29
3
Error message while booting system
...S2 DLM 1.4.2 Wed Jul 1 19:55:44 PDT 2009
(build 0faae8d4263a8c594749be558d8d7edd)
Jul 27 10:02:21 alf3 kernel: OCFS2 DLMFS 1.4.2 Wed Jul 1 19:55:44 PDT 2009
(build 0faae8d4263a8c594749be558d8d7edd)
Jul 27 10:02:21 alf3 kernel: OCFS2 User DLM kernel interface loaded
Jul 27 10:02:25 alf3 kernel: o2net: connected to node alf0 (num 0) at
172.25.29.10:7777
Jul 27 10:02:25 alf3 kernel: o2net: connected to node alf2 (num 2) at
172.25.29.12:7777
Jul 27 10:02:25 alf3 kernel: o2net: accepted connection from node alf5 (num
5) at 172.25.29.15:7777
Jul 27 10:02:26 alf3 kernel: o2net: accepted connection...
2011 May 10
3
ERROR: -91 after Kernel Upgrade
...2-tools-1.4.3
Modules are loaded and /config type configfs and /dlm type ocfs2_dlmfs
are mounted.
server2 ~ # mount /data/
mount.ocfs2: Protocol not available while mounting /dev/sdb1 on /data.
Check 'dmesg' for more information on this error.
server2 ~ # dmesg
[ 802.267217] o2net: accepted connection from node server4 (num 3) at
10.10.21.14:7777
[ 802.871908] o2net: accepted connection from node server3 (num 2) at
10.10.21.13:7777
[ 805.295632] (mount.ocfs2,13964,2):dlm_send_nodeinfo:1233 ERROR: node
mismatch -92, node 2
[ 805.295637] (mount.ocfs2,13964,2):dlm_try_to_...
2013 Apr 28
2
Is it one issue. Do you have some good ideas, thanks a lot.
...the log below.
Why does the message "Node 255 (he) is the Recovery Master for the dead node 255" appear in the syslog?
Why was the host ZHJD-VM6 blocked until it rebooted a day later, and what was it still waiting for?
Thanks a lot.
Apr 27 17:35:59 ZHJD-VM6 kernel: [ 3734.057330] o2net: Connection to node ZHJD-VM5 (num 5) at 185.200.1.16:7100 has been idle for 30.100 secs, shutting it down.
Apr 27 17:35:59 ZHJD-VM6 kernel: [ 3734.057359] o2net: No longer connected to node ZHJD-VM5 (num 5) at 185.200.1.16:7100
Apr 27 17:35:59 ZHJD-VM6 kernel: [ 3734.058212] o2net: Connected to nod...
2007 Feb 06
2
Network 10 sec timeout setting?
Hello!
Hey, didn't a setting for the 10-second network timeout get into the
2.6.20 kernel?
If so, how do we set this?
I am getting
OCFS2 1.3.3
(2201,0):o2net_connect_expired:1547 ERROR: no connection established
with node 1 after 10.0 seconds, giving up and returning errors.
(2458,0):dlm_request_join:802 ERROR: status = -107
(2458,0):dlm_try_to_join_domain:950 ERROR: status = -107
(2458,0):dlm_join_domain:1202 ERROR: status = -107
(2458,0):dlm_register_...
2009 Jul 22
2
OCFS2 Node restart
...emote logging for kernel, and here is log.
I noticed the VM becomes unresponsive and suddenly reboots. I am running the Alfresco
(document sharing) application; all nodes are accessing a common share on
OCFS.
---------------------------------------------------------
-Jul 22 09:01:25 172.25.29.10 kernel: o2net: connection to node alf3 (num 3)
at 172.25.29.13:7777 has been idle for 30.0 seconds, shutting it down.
-Jul 22 09:01:25 172.25.29.10 kernel: (0,1):o2net_idle_timer:1506 here are
some times that might help debug the situation: (tmr 1248267655.660420 now 1248267685.655778 dr 1248267655.660405...
2010 Jul 29
3
[PATCH 1/1] O2net: Disallow o2net accept connection request from itself.
Currently, o2net_accept_one() is allowed to accept a connection from the
listening node itself. Such a fake connection will not be successfully
established because no handshake is detected afterwards, and it later ends up
triggering the connecting worker in a loop.
We're going to fix this by treating such connection requ...
2010 Oct 23
1
Reg: ocfs2 two node cluster crashed, node2 crashed, when I rebooted node1 for maintenance.
...: Nodes in domain
("C54B4F6991954F98AA6A37C4F3901CD8"): 2
Oct 23 15:42:58 node2 kernel: ocfs2_dlm: Node 1 leaves domain
D96AC8E8BDD54913AE6D8EC0EB539603
Oct 23 15:42:58 node2 kernel: ocfs2_dlm: Nodes in domain
("D96AC8E8BDD54913AE6D8EC0EB539603"): 2
Oct 23 15:44:06 node2 kernel: o2net: connection to node node1 (num 1) at
192.168.3.1:7777 has been idle for 60.0 seconds, shutting it down.
Oct 23 15:44:06 node2 kernel: (swapper,0,15):o2net_idle_timer:1503 here are
some times that might help debug the situation: (tmr 1287848586.872368 now 1287848646.872227 dr 1287848586.872346 adv...
2010 Jan 14
1
another fencing question
Hi,
Periodically one of the nodes in my two-node cluster is fenced. Here are the logs:
Jan 14 07:01:44 nvr1-rc kernel: o2net: no longer connected to node
nvr2-rc.minint.it (num 0) at 1.1.1.6:7777
Jan 14 07:01:44 nvr1-rc kernel: (21534,1):dlm_do_master_request:1334 ERROR:
link to 0 went down!
Jan 14 07:01:44 nvr1-rc kernel: (4007,4):dlm_send_proxy_ast_msg:458 ERROR:
status = -112
Jan 14 07:01:44 nvr1-rc kernel: (4007,4...
2010 Dec 09
2
servers blocked on ocfs2
...servers (ocfs2-1.4.7)
Some days ago, two servers sharing an ocfs2 filesystem, and with quite
a few virtual services, stalled, in what seems to be an ocfs2 issue. These are the
lines in their messages files:
=====node heraclito (0)========================================
Dec 4 09:15:06 heraclito kernel: o2net: connection to node parmenides
(num 1) at 192.168.1.2:7777 has been idle for 30.0 seconds, shutting it
down.
Dec 4 09:15:06 heraclito kernel: (swapper,0,7):o2net_idle_timer:1503
here are some times that might help debug the situation: (tmr 1291450476.228826
now 1291450506.229456 dr 1291450476....
2009 Nov 20
3
o2net patch that avoids socket disconnect/reconnect
This fix modifies o2net layer behavior, which seems to trigger some
DLM race issues during umount/evictions that need to be fixed as well.
I am working on the dlm issues but meanwhile please review this patch.
Thanks,
--Srini
2008 Feb 04
0
[PATCH] o2net: Reconnect after idle time out.
Currently, o2net connects to a node on hb_up and disconnects on
hb_down and net timeout.
Disconnecting on net timeout is OK, but it should then attempt to
reconnect. This is because sometimes nodes get overloaded
enough that the network connection breaks but the disk hb does not.
And if we get into that situation...
2011 Feb 10
0
(o2net, 6301, 0):o2net_connect_expired:1664 ERROR: no connection established with node 1 after 60.0 seconds, giving up and returning errors.
Hello,
I am installing a two-node cluster. When I automount the file systems, I get the o2net_connect_expired error and the cluster file systems are not mounted. If I mount the cluster file systems manually with mount -a, they mount without any issues.
1. If I bring Node1 up with Node2 down, the cluster file system automounts fine without any issues.
2. I checked the c...
2008 Feb 13
2
[PATCH] o2net: Reconnect after idle time out.V2
Modification from V1 to V2:
1. Use atomic ops instead of spin_lock in timer.
2. Add some comments when querying connect_expired work.
These comments are copied from Zach's mail. ;)
Currently, o2net connects to a node on hb_up and disconnects on
hb_down and net timeout.
Disconnecting on net timeout is OK, but it should then attempt to
reconnect. This is because sometimes nodes get overloaded
enough that the network connection breaks but the disk hb does not.
And if we get into that situation...
2014 Sep 26
2
One node hangs up issue requiring good idea, thanks
Hi, all,
We use OCFS2, and the network is not good.
When the convert request message can't be sent to the other node, one node hangs and keeps waiting for the DLM.
CAS2/logdir/var/log/syslog.1-6778-Sep 16 20:57:16 CAS2 kernel: [516366.623623] o2net: Connection to node CAS1 (num 1) at 10.172.254.1:7100 has been idle for 30.87 secs, shutting it down.
CAS2/logdir/var/log/syslog.1-6779-Sep 16 20:57:16 CAS2 kernel: [516366.623631] o2net_idle_timer 1621: Local and remote node is heartbeating, and try connect
CAS2/logdir/var/log/syslog.1-6780-Sep 16...
2007 Aug 22
1
mount.ocfs2: Value too large ...
...too large for defined data type while mounting /dev/sdb1 on /ext_arrays/ds3200_1/. Check 'dmesg' for more information on this error.
---------------
In serv_x86_64's dmesg are following lines
----------------
ocfs2_dlm: Nodes in domain ("892E82953F2147A4BD75E2AAC5750BD3"): 1
o2net: connected to node serv_i386 (num 0) at 19X.XXX.69.194:7777
ocfs2_dlm: Nodes in domain ("892E82953F2147A4BD75E2AAC5750BD3"): 0 1
kjournald starting. Commit interval 5 seconds
(11637,3):ocfs2_broadcast_vote:434 ERROR: status = -75
(11637,3):ocfs2_do_request_vote:504 ERROR: status = -75
(1...
2008 Jan 23
1
OCFS2 DLM problems
...unters showing and even during the problem
we can communicate via the bond0 interface. This setup has been running
for more than 2 months but last Wednesday morning and today again, we
had 2 nodes causing locking problems. The problem starts with messages
like this:
Jan 23 03:20:44 dbprd01 kernel: o2net: no longer connected to node
dbprd02 (num 1) at 192.168.202.2:7777
Jan 23 03:20:46 dbprd01 kernel: (5172,0):dlm_send_proxy_ast_msg:459
ERROR: status = -107
Jan 23 03:20:46 dbprd01 kernel: (5172,0):dlm_flush_asts:600 ERROR:
status = -107
Jan 23 03:20:46 dbprd01 kernel: (5172,0):dlm_send_proxy_ast_ms...
2009 Apr 20
2
BUG: soft lockup - CPU#1 stuck for 61s
Hi,
I have a cluster with 5 nodes hosting a web application. All web servers
save log info into a shared access.log file. There is an awstats log
analyzer on the first node. Sometimes this node fails with the
following messages (captured on another server):
Apr 20 17:31:16 um-be-2 [145813.022112] o2net: connection to node
um-fe-1 (num 1) at 192.168.10.10:7777 has been idle for 30.0 seconds,
shutting it down.
Apr 20 17:31:16 um-be-2 [145813.022397] o2net: no longer connected to
node um-fe-1 (num 1) at 192.168.10.10:7777
Apr 20 17:31:16 um-fe-1 [ 9087.529912] o2net: connection to node
um-be-1 (num...
2006 Jan 09
0
[PATCH 01/11] ocfs2: event-driven quorum
This patch separates o2net and o2quo from knowing about one another as much
as possible. This is the first in a series of patches that will allow
userspace cluster interaction. Quorum is separated out first, and will
ultimately only be associated with the disk heartbeat as a separate module.
To do so, this patch perform...