g.digiambelardini@fabaris.it
2008-Mar-05 05:02 UTC
[Ocfs-users] cluster with 2 nodes - heartbeat problem fencing
Hi to all, this is my first time on this mailing list. I have a problem with OCFS2 on Debian etch 4.0. When a node goes down or freezes without unmounting the OCFS2 partition, I would like the heartbeat not to fence the surviving node that is still working fine (it gets a kernel panic). I would like to disable either the heartbeat or the fencing, so that we can keep working with only one node. Thanks
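For reference, the only clean way we know to stop the heartbeat (a rough sketch, assuming the stock Debian o2cb init script; /mnt/ocfs2 is just an example mount point) is to unmount first and then take the stack down, which is exactly what a frozen node cannot do:

----------------------------------------------------------
# unmount every OCFS2 volume first; heartbeat only stops after this
umount /mnt/ocfs2

# then take the cluster offline and unload the O2CB modules
/etc/init.d/o2cb offline ocfs2
/etc/init.d/o2cb unload
----------------------------------------------------------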
g.digiambelardini@fabaris.it
2008-Mar-05 06:39 UTC
[Ocfs-users] cluster with 2 nodes - heartbeat problem fencing
Hi, now the problem looks a bit different. This is my cluster.conf:
----------------------------------------------------------
node:
        ip_port = 7777
        ip_address = 1.1.1.1
        number = 0
        name = virtual1
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 1.1.1.2
        number = 1
        name = virtual2
        cluster = ocfs2

cluster:
        node_count = 2
        name = ocfs2
-----------------------------------------------------
It now seems that one node of the cluster acts as a master, or more precisely that virtual1 is the master: when we shut down the heartbeat interface (eth0, with the partition mounted) on virtual1, virtual2 goes into kernel panic. If instead we shut down eth0 on virtual2, virtual1 keeps working fine. Can somebody help us? Obviously, if we reboot either server, the partition is unmounted before the network goes down and everything works fine. THANKS
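In case it helps, this is roughly how we check the state on each node (assuming the standard ocfs2-tools commands; /dev/sdb1 only stands in for our shared device):

----------------------------------------------------------
# show whether the O2CB stack is loaded and the cluster is online on this node
/etc/init.d/o2cb status

# full detect: list which cluster nodes currently have the volume mounted
mounted.ocfs2 -f /dev/sdb1
----------------------------------------------------------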
g.digiambelardini@fabaris.it
2008-Mar-06 06:30 UTC
[Ocfs-users] cluster with 2 nodes - heartbeat problem fencing
Hi, thanks for your help. We read your link and tried several of the suggestions, but none of them worked for us. The situation is this: when we take down the eth link on the server that has node number = 0 (virtual1) while the shared partition is mounted, we only have a few seconds to unmount the partition manually (or shut down the server) before the second node (virtual2) goes into kernel panic (the partition appears to be locked).

This is our /etc/default/o2cb:
-----------------------------------------------------------------------
# O2CB_ENABLED: 'true' means to load the driver on boot.
O2CB_ENABLED=true

# O2CB_BOOTCLUSTER: If not empty, the name of a cluster to start.
O2CB_BOOTCLUSTER=ocfs2

# O2CB_HEARTBEAT_THRESHOLD: Iterations before a node is considered dead.
O2CB_HEARTBEAT_THRESHOLD=30

# O2CB_IDLE_TIMEOUT_MS: Time in ms before a network connection is considered dead.
O2CB_IDLE_TIMEOUT_MS=50000

# O2CB_KEEPALIVE_DELAY_MS: Max. time in ms before a keepalive packet is sent.
O2CB_KEEPALIVE_DELAY_MS=5000

# O2CB_RECONNECT_DELAY_MS: Min. time in ms between connection attempts.
O2CB_RECONNECT_DELAY_MS=5000
-----------------------------------------------------------------------
We have tried changing these values several times, but with no result. I think the easiest way would be to stop the heartbeat, but we have not managed to do it. HELP ME

-----Sunil Mushran <Sunil.Mushran@oracle.com> wrote: -----

To: g.digiambelardini@fabaris.it
From: Sunil Mushran <Sunil.Mushran@oracle.com>
Date: 05/03/2008 18.55
cc: ocfs-users@oss.oracle.com
Subject: Re: [Ocfs-users] cluster with 2 nodes - heartbeat problem fencing

http://oss.oracle.com/projects/ocfs2/dist/documentation/ocfs2_faq.html#QUORUM
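If we understand the FAQ correctly, these settings translate roughly into the following timers (using the (threshold - 1) * 2 seconds rule for the disk heartbeat; the numbers are only our reading of it, and they only take effect after the volumes are unmounted and o2cb is restarted on both nodes):

-----------------------------------------------------------------------
# disk heartbeat: a node is declared dead after
#   (O2CB_HEARTBEAT_THRESHOLD - 1) * 2 seconds
#   (30 - 1) * 2 = 58 seconds with the value below
O2CB_HEARTBEAT_THRESHOLD=30

# network: the connection is declared dead after 50 seconds of idle time
O2CB_IDLE_TIMEOUT_MS=50000
-----------------------------------------------------------------------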
g.digiambelardini@fabaris.it
2008-Mar-07 01:26 UTC
[Ocfs-users] cluster with 2 nodes - heartbeat problem fencing
Hi, my real problem is this: if the communication link between the two nodes breaks, especially on node 0, node 1 should not be fenced, because it is still working fine; it would make more sense to fence node 0 rather than node 1. For this reason I would like to stop the heartbeat, so that everything keeps working. Is there a method to do that?

-----Sunil Mushran <Sunil.Mushran@oracle.com> wrote: -----

To: g.digiambelardini@fabaris.it
From: Sunil Mushran <Sunil.Mushran@oracle.com>
Date: 06/03/2008 19.13
cc: ocfs-users@oss.oracle.com
Subject: Re: [Ocfs-users] cluster with 2 nodes - heartbeat problem fencing

What that note says is that in a 2 node setup, if the communication link between the two nodes breaks, the higher node number will be fenced.

In your case, you are shutting down the network on node 0. The cluster stack sees this as a comm link down between the two nodes. At this stage, even if you do a umount vol on node 1, node 1 will still have node 0 in its domain and will want to ping it to migrate lockres', leave the domain, etc. As in, umount is a clusterwide event and not an isolated one.

Forcibly shutting down hb won't work because the vol is still mounted and all those inodes are still cached and maybe still in use.

I am unclear as to what your real problem is.

Sunil
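If we read this correctly, in a two-node split the tie-break is purely the node number: the lower-numbered node keeps quorum and the higher-numbered one fences itself, regardless of whose NIC actually failed. So one possible (untested) workaround on our side would be to give the server we most want to survive number 0 in /etc/ocfs2/cluster.conf, for example:

-----------------------------------------------------
node:
        ip_port = 7777
        ip_address = 1.1.1.2
        number = 0
        name = virtual2
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 1.1.1.1
        number = 1
        name = virtual1
        cluster = ocfs2
-----------------------------------------------------

The cluster would have to be stopped on both nodes (volumes unmounted, o2cb offline) before changing this, and the file must stay identical on both nodes. It also only moves the problem: if the other node's link fails instead, the wrong node still survives.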