g.digiambelardini@fabaris.it
2008-Mar-05 05:02 UTC
[Ocfs-users] cluster with 2 nodes - heartbeat problem fencing
Hi to all, this is my first time on this mailing list. I have a problem with OCFS2 on Debian Etch 4.0. When a node goes down or freezes without unmounting the OCFS2 partition, I would like the heartbeat not to fence the server that is still working (it goes into a kernel panic). I would like to disable either the heartbeat or the fencing, so that we can keep working with only one node. Thanks
g.digiambelardini@fabaris.it
2008-Mar-05 06:39 UTC
[Ocfs-users] cluster with 2 nodes - heartbeat problem fencing
Hi,
now the problem is different.
This is my cluster.conf:
----------------------------------------------------------
node:
    ip_port = 7777
    ip_address = 1.1.1.1
    number = 0
    name = virtual1
    cluster = ocfs2

node:
    ip_port = 7777
    ip_address = 1.1.1.2
    number = 1
    name = virtual2
    cluster = ocfs2

cluster:
    node_count = 2
    name = ocfs2
-----------------------------------------------------
Now it seems that one node of the cluster is a master, or rather that virtual1 is the master: when we shut down the heartbeat interface (eth0, with the partition mounted) on virtual1, virtual2 goes into a kernel panic. If instead we shut down eth0 on virtual2, virtual1 keeps working fine.
Can somebody help us?
Obviously, if we reboot either server, the partition gets unmounted before the network goes down and everything works fine.
THANKS
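
A minimal sketch in Python (purely illustrative; the parse_cluster_conf helper is invented for this example and is not an OCFS2 or o2cb tool) of how the stanza-style cluster.conf above maps node numbers to names and addresses; those node numbers are what decides which side gets fenced later in the thread.
----------------------------------------------------------
# Illustrative only: parse the stanza-style cluster.conf shown above into
# Python tuples so the node number -> name/address mapping is explicit.

def parse_cluster_conf(text):
    """Return a list of (section, {key: value}) pairs from cluster.conf text."""
    sections = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):            # "node:" or "cluster:" stanza header
            current = (line[:-1], {})
            sections.append(current)
        elif "=" in line and current is not None:
            key, _, value = line.partition("=")
            current[1][key.strip()] = value.strip()
    return sections

conf_text = """
node:
    ip_port = 7777
    ip_address = 1.1.1.1
    number = 0
    name = virtual1
    cluster = ocfs2

node:
    ip_port = 7777
    ip_address = 1.1.1.2
    number = 1
    name = virtual2
    cluster = ocfs2

cluster:
    node_count = 2
    name = ocfs2
"""

for section, fields in parse_cluster_conf(conf_text):
    if section == "node":
        print("node %s = %s (%s:%s)" % (fields["number"], fields["name"],
                                        fields["ip_address"], fields["ip_port"]))
# node 0 = virtual1 (1.1.1.1:7777)
# node 1 = virtual2 (1.1.1.2:7777)
----------------------------------------------------------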
g.digiambelardini@fabaris.it
2008-Mar-06 06:30 UTC
[Ocfs-users] cluster with 2 nodes - heartbeat problem fencing
Hi, thanks for your help.
We read your link and tried many solutions, but nothing works well for us.
The situation is this: when we take down the Ethernet link on the server with node number = 0 (virtual1) while the shared partition is mounted, we cannot manage to manually unmount the partition (or shut down the server) in the few seconds before node 2 goes into a kernel panic (the partition seems locked).

This is our /etc/default/o2cb:

# O2CB_ENABLED: 'true' means to load the driver on boot.
O2CB_ENABLED=true

# O2CB_BOOTCLUSTER: If not empty, the name of a cluster to start.
O2CB_BOOTCLUSTER=ocfs2

# O2CB_HEARTBEAT_THRESHOLD: Iterations before a node is considered dead.
O2CB_HEARTBEAT_THRESHOLD=30

# O2CB_IDLE_TIMEOUT_MS: Time in ms before a network connection is considered dead.
O2CB_IDLE_TIMEOUT_MS=50000

# O2CB_KEEPALIVE_DELAY_MS: Max. time in ms before a keepalive packet is sent.
O2CB_KEEPALIVE_DELAY_MS=5000

# O2CB_RECONNECT_DELAY_MS: Min. time in ms between connection attempts.
O2CB_RECONNECT_DELAY_MS=5000
-----------------------------------------------------------------------
We changed the values many times, but nothing helped.
I think the easiest way would be to stop the heartbeat, but we cannot manage to do it.
HELP ME

-----Sunil Mushran <Sunil.Mushran@oracle.com> wrote: -----

To: g.digiambelardini@fabaris.it
From: Sunil Mushran <Sunil.Mushran@oracle.com>
Date: 05/03/2008 18.55
cc: ocfs-users@oss.oracle.com
Subject: Re: [Ocfs-users] cluster with 2 nodes - heartbeat problem fencing

http://oss.oracle.com/projects/ocfs2/dist/documentation/ocfs2_faq.html#QUORUM
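
A short back-of-the-envelope sketch in Python of what the o2cb values above mean in wall-clock time, assuming the conversion described in the OCFS2 FAQ (disk heartbeat declared dead after (O2CB_HEARTBEAT_THRESHOLD - 1) * 2 seconds; the network values are already in milliseconds). The variable names simply mirror the settings file; this is not o2cb code.
----------------------------------------------------------
# Illustrative only: convert the /etc/default/o2cb settings above into
# approximate wall-clock timeouts, assuming the formulas from the OCFS2 FAQ.

O2CB_HEARTBEAT_THRESHOLD = 30    # disk heartbeat iterations, ~2 s apart
O2CB_IDLE_TIMEOUT_MS = 50000     # network idle timeout, in ms
O2CB_KEEPALIVE_DELAY_MS = 5000
O2CB_RECONNECT_DELAY_MS = 5000

# Disk heartbeat: a node is declared dead after (threshold - 1) * 2 seconds.
disk_hb_dead_secs = (O2CB_HEARTBEAT_THRESHOLD - 1) * 2   # 58 s with the value above

# Network: the connection is declared dead after the idle timeout.
net_idle_secs = O2CB_IDLE_TIMEOUT_MS / 1000.0            # 50 s

print("disk heartbeat considered dead after ~%d s" % disk_hb_dead_secs)
print("network link considered dead after ~%.0f s" % net_idle_secs)

# Raising these values only delays the decision; once the cluster concludes
# the link is down, they do not change which node gets fenced.
----------------------------------------------------------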
g.digiambelardini@fabaris.it
2008-Mar-07 01:26 UTC
[Ocfs-users] cluster with 2 nodes - heartbeat problem fencing
Hi,
My real problem is that if the communication link between the two nodes breaks, especially on node 0, node 1 should not be fenced, because it is working fine; it would be more natural to fence node 0, not node 1.
For this reason I would like to stop the heartbeat, so that everything keeps working. Does a method exist to do that?

-----Sunil Mushran <Sunil.Mushran@oracle.com> wrote: -----

To: g.digiambelardini@fabaris.it
From: Sunil Mushran <Sunil.Mushran@oracle.com>
Date: 06/03/2008 19.13
cc: ocfs-users@oss.oracle.com
Subject: Re: [Ocfs-users] cluster with 2 nodes - heartbeat problem fencing

What that note says is that in a 2 node setup, if the communication link between the two nodes breaks, the higher node number will be fenced. In your case, you are shutting down the network on node 0. The cluster stack sees this as a comm link down between the two nodes.

At this stage, even if you do a umount vol on node 1, node 1 will still have node 0 in its domain and will want to ping it to migrate lockres', leave the domain, etc. As in, umount is a clusterwide event and not an isolated one.

Forcibly shutting down hb won't work because the vol is still mounted and all those inodes are still cached and maybe still in use.

I am unclear as to what your real problem is.

Sunil
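
To make the tie-break Sunil describes concrete, here is a toy sketch in Python of the two-node split decision (the function and its arguments are invented for this illustration and are not part of the o2cb/o2quo code): when both nodes are still heartbeating on disk but can no longer reach each other over the network, the node with the higher node number fences itself.
----------------------------------------------------------
# Toy illustration of the 2-node quorum tie-break described above; this is
# not the real o2quo kernel logic, just the rule restated as code.

def survives_network_split(my_node_num, other_node_num,
                           other_node_alive_on_disk):
    """Return True if this node keeps running, False if it fences itself."""
    if not other_node_alive_on_disk:
        # The other node stopped writing its disk heartbeat: it is simply
        # dead, no tie-break is needed, this node keeps running.
        return True
    # Both nodes are alive on disk but cannot talk over the network:
    # only the lower node number keeps quorum, the higher one self-fences.
    return my_node_num < other_node_num

# virtual1 (node 0) and virtual2 (node 1), eth0 down but both still on disk:
print(survives_network_split(0, 1, True))   # True  -> virtual1 keeps running
print(survives_network_split(1, 0, True))   # False -> virtual2 fences (the observed kernel panic)
----------------------------------------------------------
Because virtual2 always holds the higher node number, it is the side that fences whenever the interconnect drops while both disks are still heartbeating, which matches the behaviour reported earlier in the thread.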