search for: o2net_connect_expired

Displaying 20 results from an estimated 25 matches for "o2net_connect_expired".

2009 Jun 10
6
mount.ocfs2: Transport endpoint is not connected while mounting
...odes and one will not join the cluster. I have two IPs on each node, one external and one internal. I have tried changing around the IPs in the /etc/ocfs2/cluster.conf and that helped - at least I recovered three of the machines. Any suggestions on where else to look? Best Regards John (23359,2):o2net_connect_expired:1637 ERROR: no connection established with node 1 after 30.0 seconds, giving up and returning errors. (23359,2):o2net_connect_expired:1637 ERROR: no connection established with node 2 after 30.0 seconds, giving up and returning errors. (23359,2):o2net_connect_expired:1637 ERROR: no connection estab...
2009 Jun 24
1
o2net_connect_expired:1637 ERROR
Hi, I have an OCFS2 cluster of 5 nodes running on RHEL 5.2 (kernel version 2.6.18-128.1.10.el5). I am getting errors like: Jun 24 09:26:54 alf2 kernel: (2095,0):o2net_connect_expired:1637 ERROR: no connection established with node 2 after 30.0 seconds, giving up and returning errors. Jun 24 09:27:54 alf2 last message repeated 2 times Jun 24 09:29:24 alf2 last message repeated 3 times Jun 24 09:30:54 alf2 last message repeated 3 times Jun 24 09:32:24 alf2 last message repe...
2009 Jun 09
6
question about oracle shared home install
Hi All, Scenario: I'm trying to install 9i RAC on a 2-node cluster on OCFS2. OS: Oracle Enterprise Linux. To my understanding, OCFS2 supports shared home installs, which as far as I know means I can keep not only datafiles and control files on it but also cluster manager files and binaries (pretty much everything: no files or executables need to be kept local to any node). I have one single shared file for
2011 Feb 10
0
(o2net, 6301, 0):o2net_connect_expired:1664 ERROR: no connection established with node 1 after 60.0 seconds, giving up and returning errors.
Hello, I am installing a two-node cluster. When I automount the file systems I get the o2net_connect_expired error and the cluster file systems are not mounted. If I mount the cluster file systems manually with mount -a, they mount without any issues. 1. If I bring Node1 up with Node2 down, the cluster file system automounts fine without any issues. 2. I checked the cluster.conf is s...
2010 Jan 14
1
another fencing question
...14 07:01:44 nvr1-rc kernel: (4007,4):dlm_send_proxy_ast_msg:458 ERROR: status = -112 Jan 14 07:01:44 nvr1-rc kernel: (4007,4):dlm_flush_asts:600 ERROR: status = -112 Jan 14 07:01:44 nvr1-rc kernel: (21534,1):dlm_get_lock_resource:917 ERROR: status = -112 Jan 14 07:02:19 nvr1-rc kernel: (3950,5):o2net_connect_expired:1664 ERROR: no connection established with node 0 after 35.0 seconds, giving up and returning errors. Jan 14 07:02:54 nvr1-rc kernel: (3950,5):o2net_connect_expired:1664 ERROR: no connection established with node 0 after 35.0 seconds, giving up and returning errors. Jan 14 07:03:10 nvr1-rc kern...
2009 Jul 22
2
OCFS2 Node restart
...now 1248267685.812715 dr 1248267655.816401 adv 1248267655.816401:1248267655.816401 func (0ffa2aed:502) 1248267507.842160:1248267507.842160) -Jul 22 09:01:25 172.25.29.15 kernel: o2net: no longer connected to node alf3 (num 3) at 172.25.29.13:7777 -Jul 22 09:01:55 172.25.29.10 kernel: (2733,1):o2net_connect_expired:1667 ERROR: no connection established with node 3 after 30.0 seconds, giving up and returning errors. -Jul 22 09:01:55 172.25.29.15 kernel: (2541,0):o2net_connect_expired:1667 ERROR: no connection established with node 3 after 30.0 seconds, giving up and returning errors. How can I k...
2009 Jan 14
1
Transport endpoint is not connected while mounting....
...n Node 1: `mount.ocfs2 /dev/mapper/data /cluster/ data` I get this error after about 30 seconds: mount.ocfs2: Transport endpoint is not connected while mounting /dev/mapper/data on /cluster/ data. Check 'dmesg' for more information on this error. Here is the output of dmesg: (3130,1):o2net_connect_expired:1659 ERROR: no connection established with node 0 after 30.0 seconds, giving up and returning errors. (4670,1):dlm_request_join:1033 ERROR: status = -107 (4670,1):dlm_try_to_join_domain:1207 ERROR: status = -107 (4670,1):dlm_join_domain:1485 ERROR: status = -107 (4670,1):dlm_register_domain:1732...
2010 Oct 23
1
Reg: ocfs2 two node cluster crashed, node2 crashed, when I rebooted node1 for maintenance.
...48586.872368 now 1287848646.872227 dr 1287848586.872346 adv 1287848586.872376:1287848586.872376 func (fb860756:513) 1287848578.874476:1287848578.874487) Oct 23 15:44:06 node2 kernel: o2net: no longer connected to node node1 (num 1) at 192.168.3.1:7777 Oct 23 15:45:06 node2 kernel: (o2net,14590,15):o2net_connect_expired:1664 ERROR: no connection established with node 1 after 60.0 seconds, giving up and returning errors. Oct 23 15:46:06 node2 kernel: (o2net,14590,15):o2net_connect_expired:1664 ERROR: no connection established with node 1 after 60.0 seconds, giving up and returning errors. Oct 23 15:51:34 node2 sy...
2009 Mar 18
2
shutdown by o2net_idle_timer causes Xen to hang
...rnel: o2net: no longer connected to node cod-2 (num 3) at 10.0.0.42:7777 Mar 15 14:39:47 ugc-1 kernel: (24452,0):dlm_do_master_request:1335 ERROR: link to 3 went down! Mar 15 14:39:47 ugc-1 kernel: (24452,0):dlm_get_lock_resource:912 ERROR: status = -112 Mar 15 14:40:17 ugc-1 kernel: (1743,0):o2net_connect_expired:1637 ERROR: no connection established with node 3 after 30.0 seconds, giving up and returning errors. Mar 15 14:44:29 ugc-1 kernel: (16225,0):dlm_do_master_request:1335 ERROR: link to 3 went down! Mar 15 14:44:29 ugc-1 kernel: (16225,0):dlm_get_lock_resource:912 ERROR: status = -107 Mar 15...
2009 Aug 10
1
error mounting ocfs2 mountpoints
...agnostics in /tmp/crsctl.11708. Aug 10 10:37:20 guard0 logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.11694. Aug 10 10:37:20 guard0 logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.11880. Aug 10 10:37:24 guard0 kernel: (17056,12):o2net_connect_expired:1585 ERROR: no connection established with node 1 after 60.0 seconds, giving up and returning errors. Aug 10 10:37:24 guard0 kernel: (17171,12):dlm_request_join:901 ERROR: status = -107 Aug 10 10:37:24 guard0 kernel: (17171,12):dlm_try_to_join_domain:1049 ERROR: status = -107 Aug 10 10:37:24 guard0...
2006 May 26
1
Another node is heartbeating in our slot!
...Device "sdb1": another node is heartbeating in our slot! (4159,0):o2hb_do_disk_heartbeat:962 ERROR: Device "sdb1": another node is heartbeating in our slot! (4159,0):o2hb_do_disk_heartbeat:962 ERROR: Device "sdb1": another node is heartbeating in our slot! (3257,0):o2net_connect_expired:1444 ERROR: no connection established with node 1 after 10 seconds, giving up and returning errors. (4157,1):dlm_request_join:786 ERROR: status = -107 (4157,1):dlm_try_to_join_domain:934 ERROR: status = -107 (4157,1):dlm_join_domain:1186 ERROR: status = -107 (4157,1):dlm_register_domain:1379 ER...
2009 Jul 29
3
Error message while booting system
...1248876927.861591 now 1248876957.858464 dr 1248876927.861556 adv 1248876927.861622:1248876927.861623 func (0ffa2aed:506) 1248876927.861592:1248876927.861604) Jul 29 10:15:57 alf1 kernel: o2net: no longer connected to node alf3 (num 3) at 172.25.29.13:7777 Jul 29 10:16:27 alf1 kernel: (2600,1):o2net_connect_expired:1667 ERROR: no connection established with node 3 after 30.0 seconds, giving up and returning errors. Jul 29 10:17:27 alf1 last message repeated 2 times Jul 29 10:17:30 alf1 kernel: (2618,0):ocfs2_dlm_eviction_cb:98 device (8,33): dlm has evicted node 3 Jul 29 10:17:32 alf1 kernel: (2629,2):dl...
2011 Mar 04
1
node eviction
...59624F7042EAB9829B18CA65FC88: recovery map is not empty, but must master $RECOVERY lock now Mar 3 16:18:04 xirisoas3 kernel: (23344,2):dlm_do_recovery:519 (23344) Node 3 is the Recovery Master for the Dead Node 2 for Domain 129859624F7042EAB9829B18CA65FC88 Mar 3 16:20:48 xirisoas3 kernel: (22790,2):o2net_connect_expired:1585 ERROR: no connection established with node 2 after 10.0 seconds, giving up and returning errors. Mar 3 16:20:59 xirisoas3 kernel: o2net: connected to node XIRISOAS2 (num 2) at 10.0.0.5:9999 Mar 3 16:20:59 xirisoas3 kernel: ocfs2_dlm: Node 2 joins domain 129859624F7042EAB9829B18CA65FC88 Mar 3 1...
2010 Apr 05
1
Kernel Panic, Server not coming back up
...= -5 (2881,0):ocfs2_expand_nonsparse_inode:1678 ERROR: status = -5 (2881,0):ocfs2_write_begin_nolock:1722 ERROR: status = -5 (2881,0):ocfs2_write_begin:1860 ERROR: status = -5 (2881,0):ocfs2_file_buffered_write:2039 ERROR: status = -5 (2881,0):__ocfs2_file_aio_write:2194 ERROR: status = -5 (2045,0):o2net_connect_expired:1664 ERROR: no connection established with node 2 after 30.0 seconds, giving up and returning errors. OCFS2: ERROR (device sdc1): ocfs2_check_group_descriptor: Group descriptor # 1128960 has bit count 32256 but claims that 34300 are free (2872,0):ocfs2_search_chain:1244 ERROR: status = -5 (2872,0):...
2010 Oct 20
1
OCFS2 + iscsi: another node is heartbeating in our slot (over scst)
...0):o2hb_do_disk_heartbeat:770 ERROR: Device "sda1": another node is heartbeating in our slot! Oct 19 22:21:02 storage kernel: [ 1510.428600] o2net: connection to node node-2 (num 0) at 192.168.1.69:7777 shutdown, state 7 . . Oct 19 22:21:32 storage kernel: [ 1540.448016] (o2net,4404,0):o2net_connect_expired:1659 ERROR: no connection established with node 0 after 30.0 seconds, giving up and returning errors. Oct 19 22:21:38 storage kernel: [ 1546.496143] (o2hb-2283B3335E,4427,0):o2hb_do_disk_heartbeat:770 ERROR: Device "sda1": another node is heartbeating in our slot! Finally, storage ser...
2006 Jan 09
0
[PATCH 01/11] ocfs2: event-driven quorum
...2nm_get_node_by_num(node_num); + + o2hb_notify(O2HB_CONN_UP_CB, node, node_num); + /* this is a bit of a hack. we only try reconnecting * when heartbeating starts until we get a connection. * if that connection then dies we don't try reconnecting. @@ -1424,13 +1427,6 @@ static void o2net_connect_expired(void * spin_unlock(&nn->nn_lock); } -static void o2net_still_up(void *arg) -{ - struct o2net_node *nn = arg; - - o2quo_hb_still_up(o2net_num_from_nn(nn)); -} - /* ------------------------------------------------------------ */ void o2net_disconnect_node(struct o2nm_node *node) @@ -1...
2007 Feb 06
2
Network 10 sec timeout setting?
Hello! Didn't a setting for the 10-second network timeout get into the 2.6.20 kernel? If so, how do we set it? I am getting: OCFS2 1.3.3 (2201,0):o2net_connect_expired:1547 ERROR: no connection established with node 1 after 10.0 seconds, giving up and returning errors. (2458,0):dlm_request_join:802 ERROR: status = -107 (2458,0):dlm_try_to_join_domain:950 ERROR: status = -107 (2458,0):dlm_join_domain:1202 ERROR: status = -107 (2458,0):dlm_register_domain:1393 ERRO...
2011 Dec 20
8
ocfs2 - Kernel panic on many write/read from both
Sorry, I didn't copy everything: TEST-MAIL1# echo "ls //orphan_dir:0000"|debugfs.ocfs2 /dev/dm-0|wc debugfs.ocfs2 1.6.4 5239722 26198604 246266859 TEST-MAIL1# echo "ls //orphan_dir:0001"|debugfs.ocfs2 /dev/dm-0|wc debugfs.ocfs2 1.6.4 6074335 30371669 285493670 TEST-MAIL2 ~ # echo "ls //orphan_dir:0000"|debugfs.ocfs2 /dev/dm-0|wc debugfs.ocfs2 1.6.4 5239722 26198604
2008 Sep 18
2
o2hb_do_disk_heartbeat:982:ERROR
Hi everyone, I have a problem on my 10-node cluster with OCFS2 1.2.9; the OS is RHEL 4.7 AS. Nine nodes can start the o2cb service and mount the SAN disks on startup, but one node cannot. My cluster configuration is: node: ip_port = 7777 ip_address = 192.168.5.1 number = 0 name = fa01 cluster = ocfs2 node: ip_port =
2009 Nov 20
3
o2net patch that avoids socket disconnect/reconnect
This fix modifies o2net layer behavior, which seems to trigger some DLM race issues during umount/evictions that need to be fixed as well. I am working on the DLM issues, but meanwhile please review this patch. Thanks, --Srini