Displaying 20 results from an estimated 10000 matches similar to: "2 node cluster, Heartbeat2, self-fencing"
2008 Jul 01
5
ocfs2 fencing problem
Hi, Sunil or Tao,
I have a 4-node OCFS2 cluster running OCFS2 1.2.8 on SuSE 9 SP4. When I
tried to do failover testing (shutting down one node), the whole cluster
hung (I could not even log in to any server in the cluster). I had to
bring all of the nodes back up before the system was usable again. What kind of
behavior is this? Is it OCFS2 fencing? Below is my configuration.
aopcer13:~ #
2007 Dec 01
1
Good tutorial about using heartbeat2, ocfs2 and evms with xen 3.x
Hi all
Can somebody point me to a good tutorial about using high-availability
clusters with xen using heartbeat2, ocfs2 and evms under rhel/centos, debian or
sles?
I have done various searches without result ... (google shows me a lot of
references, mailing lists, etc., but not a good doc)
Many thanks.
--
CL Martinez
carlopmart {at} gmail {d0t} com
2008 Mar 05
3
cluster with 2 nodes - heartbeat problem fencing
Hi to all, this is my first time on this mailing list.
I have a problem with OCFS2 on Debian Etch 4.0.
When a node goes down or freezes without unmounting the OCFS2 partition,
I would like the heartbeat not to fence (kernel panic) the server that is still working.
I would like to disable either the heartbeat or the fencing, so that we can keep
working with only one node.
Thanks
2010 Dec 09
1
[PATCH] Call userspace script when self-fencing
Hi,
According to the comment in fs/ocfs2/cluster/quorum.c:70 about the
self-fencing operations:
/* It should instead flip the file
 * system RO and call some userspace script. */
So, I tried to add it (but I didn't find a way to flip the fs to RO).
Here is a proposal for this functionality, based on ocfs2-1.4.7.
This patch adds an entry 'fence_cmd' in /sys to specify an
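The snippet is cut off above. Purely as an illustration of what the proposal describes (the 'fence_cmd' attribute comes from this out-of-tree patch, and the exact configfs path and script name below are assumptions, not part of mainline o2cb), usage might look like:

# hypothetical: run a local script when the node would otherwise self-fence
echo /usr/local/sbin/ocfs2-fence-action > /sys/kernel/config/cluster/ocfs2cluster/fence_cmd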
2010 Nov 02
1
disabling self-fencing
Hi all,
on my nodes I have another cluster manager running that takes care of
fencing the node. Is it safe to disable ocfs2 self-fencing?
Details:
I'm using ocfs2 over drbd in master/master aka primary/primary mode.
In case of loss of network connectivity I would like to disconnect the
drbd device, invalidate it and unmount the filesystem but ocfs2
reboots the node....
Is it possible to
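As a minimal sketch of the recovery sequence described above, using standard drbdadm commands (the resource name r0 and mount point /data are assumptions; drbd usually requires the device to be demoted to secondary before it can be invalidated):

umount /data            # release the ocfs2 filesystem first
drbdadm disconnect r0   # drop the replication link
drbdadm secondary r0    # demote the local copy
drbdadm invalidate r0   # mark local data outdated; a full resync follows on reconnect

Whether skipping ocfs2 self-fencing is safe still depends on the other cluster manager reliably fencing the peer.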
2006 Jul 28
3
Private Interconnect and self fencing
I have an OCFS2 filesystem on a Coraid AoE device.
It mounts fine, but under heavy I/O the server self-fences, claiming a
write timeout:
(16,2):o2hb_write_timeout:164 ERROR: Heartbeat write timeout to device
etherd/e0.1p1 after 12000 milliseconds
(16,2):o2hb_stop_all_regions:1789 ERROR: stopping heartbeat on all
active regions.
Kernel panic - not syncing: ocfs2 is very sorry to be fencing this
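For context, the 12000 milliseconds in the log is the default o2cb disk-heartbeat window; it is derived from the heartbeat threshold as:

timeout_ms = (O2CB_HEARTBEAT_THRESHOLD - 1) * 2000
           = (7 - 1) * 2000 = 12000 ms    # with the default threshold of 7

Raising the threshold lengthens the window before a node self-fences on slow or congested storage.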
2007 Sep 04
3
Ocfs2 and debian
Hi.
I'm pretty new to ocfs2 and clusters.
I'm trying to get ocfs2 running over a drbd device.
I know it's not the best solution, but for now I must deal with this.
I set up drbd and it works perfectly.
I set up ocfs2 and I'm not able to make it work.
/etc/init.d/o2cb status:
Module "configfs": Loaded
Filesystem "configfs": Mounted
Module
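The status output is cut off above. For context, on Debian the o2cb init script is normally driven like this (the cluster name mycluster is an assumption; the defaults usually live in /etc/default/o2cb):

# /etc/init.d/o2cb configure          # enable the stack at boot and set timeouts interactively
# /etc/init.d/o2cb online mycluster   # bring the cluster defined in /etc/ocfs2/cluster.conf online
# /etc/init.d/o2cb status             # should then report the modules loaded and the cluster online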
2006 Apr 18
1
Self-fencing issues (RHEL4)
Hi.
I'm running RHEL4 for my test system, Adaptec Firewire controllers,
Maxtor One Touch III shared disk (see the details below),
100Mb/s dedicated interconnect. It panics with no load about every
20 minutes (error message from netconsole attached).
Any clues?
Yegor
---
[root@rac1 ~]# cat /proc/fs/ocfs2/version
OCFS2 1.2.0 Tue Mar 7 15:51:20 PST 2006 (build
2007 Mar 13
3
OCFSv2 in a mail cluster environment
Hello,
First, thanks to all the people who helped to license this software under
the GPL. This is a very important piece of work for Free Software in the
enterprise market.
Just a few newbie questions. We've been thinking about implementing a mail
cluster with a Fiber SAN (IBM DS-4000) and OCFSv2 as the storage
backend. The information on this volume will essentially be Postfix
Maildirs,
2009 Nov 17
1
[PATCH 1/1] ocfs2/cluster: Make fence method configurable
By default, o2cb fences the box by calling emergency_restart(). While this
scheme works well in production, it comes in the way during testing as it
does not let the tester take stack/core dumps for analysis.
This patch allows the user to dynamically change the fence method to panic() by:
# echo "panic" > /sys/kernel/config/cluster/<clustername>/fence_method
Signed-off-by: Sunil
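Assuming the patch above is applied (the mechanism was later merged into mainline o2cb), checking and switching the fence method looks roughly like this; the cluster name ocfs2cluster is an assumption:

# cat /sys/kernel/config/cluster/ocfs2cluster/fence_method            # "reset" by default
# echo panic > /sys/kernel/config/cluster/ocfs2cluster/fence_method   # panic() so stack/core dumps can be taken
# echo reset > /sys/kernel/config/cluster/ocfs2cluster/fence_method   # restore the production default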
2009 Oct 20
1
Fencing OCFS
Hi people, I installed ocfs2 on my virtual machines, with CentOS 5.3 and Xen.
But when I turn off machine1, ocfs2 starts fencing
machine2. I read the docs on oracle.com but could not solve the
problem. Can someone help?
My config and my package versions:
cluster.conf
node:
ip_port = 7777
ip_address = 192.168.1.35
number = 0
name = x1
cluster
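The configuration is cut off above. For comparison, a complete two-node /etc/ocfs2/cluster.conf normally looks like the following (the second node's name and address are illustrative, not the poster's; attribute lines are tab-indented):

node:
	ip_port = 7777
	ip_address = 192.168.1.35
	number = 0
	name = x1
	cluster = ocfs2

node:
	ip_port = 7777
	ip_address = 192.168.1.36
	number = 1
	name = x2
	cluster = ocfs2

cluster:
	node_count = 2
	name = ocfs2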
2008 Jul 14
1
Node fence on RHEL4 machine running 1.2.8-2
Hello,
We have a four-node RHEL4 RAC cluster running OCFS2 version 1.2.8-2 and
the 2.6.9-67.0.4hugemem kernel. The cluster has been really stable since
we upgraded to 1.2.8-2 early this year, but this morning, one of the
nodes fenced and rebooted itself, and I wonder if anyone could glance at
the below remote syslogs and offer an opinion as to why.
First, here's the output of
2010 Jan 18
1
Getting Closer (was: Fencing options)
One more follow-on:
The combination of kernel.panic=60 and kernel.printk=7 4 1 7 seems to
have netted the culprit:
E01-netconsole.log:Jan 18 09:45:10 E01 (10,0):o2hb_write_timeout:137
ERROR: Heartbeat write timeout to device dm-12 after 60000
milliseconds
E01-netconsole.log:Jan 18 09:45:10 E01
(10,0):o2hb_stop_all_regions:1517 ERROR: stopping heartbeat on all
active regions.
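For anyone reproducing this, both settings are ordinary sysctls; they can be made persistent in /etc/sysctl.conf (the values shown are the ones from this post) and applied with sysctl -p:

kernel.panic = 60          # reboot 60 s after a panic, leaving time to capture console output
kernel.printk = 7 4 1 7    # raise the console log level so the panic path reaches netconsole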
2009 Sep 24
1
strange fencing behavior
I have 10 servers in a cluster running Debian Etch with kernel 2.6.26-bpo.2
and a backport of ocfs2-tools 1.4.1-1.
I'm using AoE to export the drives from a Debian Lenny server in the
cluster.
My problem is that if I mount the ocfs2 partition on the server that is
exporting it via AoE, it fences the entire cluster. Looking at the logs
on the server exporting the ocfs2 partition doesn't give much information...
2009 Jun 04
2
OCFS2 v1.4 hangs
I have four database servers in a high-availability, load-balancing
configuration. Each machine has a mount to a common data source which is
an OCFS2 v1.4 file-system. While working on three of the servers, I
restarted the IP network and found afterwards that the fourth machine had hung.
I could not reboot it and could not unmount the ocfs2 partitions. I am
pretty sure this was all caused by my taking down
2007 Feb 06
2
Network 10 sec timeout setting?
Hello!
Hey, didn't a setting for the 10-second network timeout get into the
2.6.20 kernel?
If so, how do we set it?
I am getting
OCFS2 1.3.3
(2201,0):o2net_connect_expired:1547 ERROR: no connection established
with node 1 after 10.0 seconds, giving up and returning errors.
(2458,0):dlm_request_join:802 ERROR: status = -107
(2458,0):dlm_try_to_join_domain:950 ERROR: status = -107
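The 10-second figure is o2net's idle timeout. In later o2cb releases it is exposed as a configfs attribute and an init-script variable; as a rough sketch, assuming a cluster named ocfs2cluster and a tools version that supports configurable timeouts (the value must match on every node and is normally changed while the cluster is offline):

# cat /sys/kernel/config/cluster/ocfs2cluster/idle_timeout_ms        # current timeout in ms
# echo 30000 > /sys/kernel/config/cluster/ocfs2cluster/idle_timeout_ms

Alternatively, set O2CB_IDLE_TIMEOUT_MS in /etc/sysconfig/o2cb and restart o2cb.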
2010 May 26
1
Failover testing problem and a heartbeat question
We have a setup with 15 hosts fibre-attached via a switch to a common SAN. Each host has a single fibre port; the SAN has two controllers, each with two ports. The SAN exposes four OCFS2 v1.4.2 volumes. While performing a failover test, we observed 8 hosts fence and 2 reboot _without_ fencing. The OCFS2 FAQ recommends a default disk heartbeat of 31-61 loops for multipath I/O users. Our initial
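For reference, the FAQ's "loops" figure maps onto the o2cb disk-heartbeat threshold, where the dead window is (threshold - 1) * 2000 ms, so 31 loops is roughly 60 s. On most distributions it is set in the o2cb sysconfig file, for example:

O2CB_HEARTBEAT_THRESHOLD=31    # in /etc/sysconfig/o2cb; (31 - 1) * 2000 ms = 60 s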
2007 Sep 30
4
Question about increasing node slots
We have a test 10gR2 RAC cluster using ocfs2 filesystems for the
Clusterware files and the Database files.
We need to increase the node slots to accommodate new RAC nodes. Is it
true that we will need to unmount these filesystems for the upgrade (i.e.
the Database and Clusterware filesystems as well)?
We are planning to use the following command format to perform the node
slot increase:
# tunefs.ocfs2 -N 3
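For illustration, the full invocation takes the device as its final argument (the device path below is a placeholder); with the 1.2-era tools the volume generally has to be unmounted on every node before the slot count can be changed:

# tunefs.ocfs2 -N 3 /dev/sdX1                        # raise the number of node slots to 3
# debugfs.ocfs2 -R stats /dev/sdX1 | grep -i slots   # verify the new slot count afterwards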
2009 Feb 12
2
mount point is not unique among all nodes
Hi list,
Here is a bug report on Novell bugzilla (https://bugzilla.novell.com/show_bug.cgi?id=456280):
a directory used as a mount point on node A can be removed from node B.
The problem is, node B does not know an empty dir is being used as a mount point on another node. Is
there any solution to return -EBUSY when a dir is being used as a mount point on another node?
Thanks in advance.
--
Coly Li
SuSE Labs
2008 Oct 17
1
Preventing DomU corruption in case of Split-Brain of heartbeat
Hi Xen-Users!
We run a large HA Xen system based on heartbeat2.
The storage base is an InfiniBand storage cluster exporting iSCSI devices
to the frontend HA Xen machines. The iSCSI devices are used as physical
devices for the domUs using the block-iscsi mechanism (by the way, thanks for
this cool script).
Recently we had a split brain in our heartbeat system. This caused both
of our Xen servers to