similar to: Failover testing problem and a heartbeat question

Displaying 20 results from an estimated 10000 matches similar to: "Failover testing problem and a heartbeat question"

2010 Jan 18
1
Getting Closer (was: Fencing options)
One more follow-on: the combination of kernel.panic=60 and kernel.printk=7 4 1 7 seems to have netted the culprit: E01-netconsole.log:Jan 18 09:45:10 E01 (10,0):o2hb_write_timeout:137 ERROR: Heartbeat write timeout to device dm-12 after 60000 milliseconds E01-netconsole.log:Jan 18 09:45:10 E01 (10,0):o2hb_stop_all_regions:1517 ERROR: stopping heartbeat on all active regions.
2007 Jul 29
1
6 node cluster with unexplained reboots
We just installed a new cluster with 6 HP DL380g5, dual single-port Qlogic 24xx HBAs connected via two HP 4/16 Storageworks switches to a 3Par S400. We are using the 3Par recommended config for the Qlogic driver and device-mapper-multipath, giving us 4 paths to the SAN. We do see some SCSI errors where DM-MP is failing a path after getting a 0x2000 error from the SAN controller, but the path gets puts
2009 Aug 08
1
Heartbeat Timeout Threshold
I've been using OCFS2 on a 3 way Centos 5.2 Xen cluster for a while now using it to share the VM disk images. In this way I can have live and transparent VM migration. I'd been having intermittent (every 2-3 weeks) incidents where a server would self fence. After configuring netconsole I managed to see that the fencing was due to a heartbeat threshold timeout so I have now increased
2009 Sep 24
1
strange fencing behavior
I have 10 servers in a cluster running Debian Etch with 2.6.26-bpo.2 with a backport of ocfs2-tools-1.4.1-1 I'm using AoE to export the drives from a Debian Lenny server in the cluster. My problem is if I mount the ocfs2 partition on the server that is exporting it via AoE it fences the entire cluster. Looking at the logs exporting the ocfs2 partition doesn't give much information...
2007 Feb 06
2
Network 10 sec timeout setting?
Hello! Hey, didn't a setting for the 10 second network timeout get into the 2.6.20 kernel? If so, how do we set this? I am getting OCFS2 1.3.3 (2201,0):o2net_connect_expired:1547 ERROR: no connection established with node 1 after 10.0 seconds, giving up and returning errors. (2458,0):dlm_request_join:802 ERROR: status = -107 (2458,0):dlm_try_to_join_domain:950 ERROR: status = -107
2006 Nov 03
2
Newbie questions -- is OCFS2 what I even want?
Dear Sirs and Madams, I run a small visual effects production company, Hammerhead Productions. We'd like to have an easily extensible inexpensive relatively high-performance storage network using open-source components. I was hoping that OCFS2 would be that system. I have a half-dozen 2 TB fileservers I'd like the rest of the network to see as a single 12 TB disk, with the aggregate
2008 Mar 05
3
cluster with 2 nodes - heartbeat problem fencing
Hi to all, this is my first time on this mailing list. I have a problem with OCFS2 on Debian Etch 4.0. When a node goes down or freezes without unmounting the ocfs2 partition, I'd like the heartbeat not to fence the server that is still working well (kernel panic). I'd like to disable either heartbeat or fencing, so that we can also work with only 1 node. Thanks
2008 Jul 01
5
ocfs2 fencing problem
Hi, Sunil or Tao, I have a 4-node OCFS2 cluster running OCFS2 1.2.8 on SuSE 9 SP4. When I tried to do failover testing (shutting down one node), the whole cluster hung (I cannot even log in to any server in the cluster). I have to bring all of them back up before I can use the system. What kind of behavior is it? Is it the fence of OCFS2? Below is my configuration. aopcer13:~ #
2009 Nov 13
1
Cannot set heartbeat dead threshold
Hi I have: SLES 10 SP2 (2.6.16.60-0.21-smp) ocfs2-tools-1.4.0-0.3 ocfs2console-1.4.0-0.3 and I can't change "heartbeat dead threshold" value. Content of /etc/sysconfig/o2cb: # O2CB_ENABLED: 'true' means to load the driver on boot. O2CB_ENABLED=true # O2CB_BOOTCLUSTER: If not empty, the name of a cluster to start. O2CB_BOOTCLUSTER=ocfs2 # O2CB_HEARTBEAT_THRESHOLD:
2009 Aug 13
1
Shutdown to single user mode causes SysRq Reset
Hello, I've got a 2 node HP DL580 cluster supported by a Fibrechannel SAN with dual FC cards, dual switches and an HP EVA on the back end. All SAN disks are multipathed. Installed software is: Redhat 5.3 ocfs2-2.6.18-128.1.14.el5-1.4.2-1.el5 ocfs2-tools-1.4.2-1.el5 ocfs2console-1.4.2-1.el5 Oracle RAC 11g ASM Oracle RAC 11g Clusterware Oracle RAC 10g databases OCFS2 isn't being used by
2009 Jul 29
3
Error message while booting system
Hi, When the system boots, I get the error message "modprobe: FATAL: Module ocfs2_stackglue not found" in message. Some nodes reboot without any error message. ------------------------------------------------- Jul 27 10:02:19 alf3 kernel: ip_tables: (C) 2000-2006 Netfilter Core Team Jul 27 10:02:19 alf3 kernel: Netfilter messages via NETLINK v0.30. Jul 27 10:02:19 alf3 kernel:
2011 Mar 04
1
node eviction
Hello... I wonder if someone has had a similar problem to this... a node gets evicted on an almost weekly basis and I have not found the root cause yet.... Mar 2 10:20:57 xirisoas3 kernel: ocfs2_dlm: Node 1 joins domain 129859624F7042EAB9829B18CA65FC88 Mar 2 10:20:57 xirisoas3 kernel: ocfs2_dlm: Nodes in domain ("129859624F7042EAB9829B18CA65FC88"): 1 2 3 4 Mar 3 16:18:02 xirisoas3 kernel:
2009 Jun 24
3
Unexplained reboots in DRBD82 + OCFS2 setup
We're trying to setup a dual-primary DRBD environment, with a shared disk with either OCFS2 or GFS. The environment is a Centos 5.3 with DRBD82 (but also tried with DRBD83 from testing) . Setting up a single primary disk and running bonnie++ on it works. Setting up a dual-primary disk, only mounting it on one node (ext3) and running bonnie++ works When setting up ocfs2 on the /dev/drbd0
2008 Sep 10
4
mount.ocfs2: Error when attempting to run /sbin/ocfs2_hb_ctl: "Operation not permitted".
Hi, I am trying to configure a two node cluster on SLES10SP2 using user level heartbeat. Here is my configuration. ocfs2-tools-1.4.0-0.3 **user level heartbeat** -> lsmod | grep ocfs ocfs2_user_heartbeat 20992 1 ocfs2_dlmfs 37776 1 ocfs2_dlm 204456 1 ocfs2_dlmfs ocfs2_nodemanager 223384 6 ocfs2_user_heartbeat,ocfs2_dlmfs,ocfs2_dlm configfs 44700 3 ocfs2_user_heartbeat,ocfs2_nodemanager
2005 Jul 12
1
problem mounting ocfs2: heartbeat
When attempting to mount the OCFS2 file system I'm getting the following error message: ocfs2_hb_ctl: Internal logic failure while starting heartbeat mount.ocfs2: Error when attempting to run /sbin/ocfs2_hb_ctl: "Operation not permitted" I followed the steps given in the users_guide: modprobe ocfs2_dlmfs mount -t configfs none /config mount -t ocfs2_dlmfs none /dlm o2cb_ctl
2005 Sep 23
5
ocfs2 <-> 10G (10.2.01) Clusterware
RHEL 4 (CENT OS) Am I wasting my time trying to get the 10G Clusterware installer to use OCFS2 volumes for the voting and OCR disks? The ocfs2 setup seems happy on both nodes, but the 10G installer says the location entered for the Oracle Cluster Registry (OCR) is not shared across all the nodes in the cluster. Do the volumes need to be mounted? I did, with no change. [root@green rc5.d]#
2006 Aug 01
1
AW: ocfs2_search_chain: Group Descriptor has bad signature
I'm using ocfs2 and all modules from Suse (SLES9), no self compilations. Here are the details: * 32-bit machine (writing to ocfs2 partition/LUN and where the corruption was reported): Kernel: 2.6.5-7.257-bigsmp #1 SMP i686 i386 GNU/Linux OCFS2 rpms: ocfs2console-1.2.1-4.2 ocfs2-tools-1.2.1-4.2 o2cb_ctl -V: o2cb_ctl version 1.2.1 /etc/init.d/o2cb status: Module "configfs":
2005 Oct 12
2
Unable to access cluster service
Hello, I'm running Ubuntu Breezy with the OCFS2 modules in the standard kernel. I installed ocfs2console and ocfs2-tools, and I've formatted a partition with ocfs2. But I can't add any node or mount the device (with the ocfs2console), because I get an "Unable to access cluster service" error. I can't find the cause or the solution to this. root@lenaeja:~# /etc/init.d/o2cb status
2006 Apr 18
1
Self-fencing issues (RHEL4)
Hi. I'm running RHEL4 for my test system, Adaptec Firewire controllers, Maxtor One Touch III shared disk (see the details below), 100Mb/s dedicated interconnect. It panics with no load about every 20 minutes (error message from netconsole attached). Any clues? Yegor --- [root@rac1 ~]# cat /proc/fs/ocfs2/version OCFS2 1.2.0 Tue Mar 7 15:51:20 PST 2006 (build