Hi,

I am running a two-node active/passive cluster on CentOS 3 Update 8 (64-bit) on HP boxes, with external HP storage connected via SCSI. The cluster ran fine for the last 3 years, but all of a sudden the cluster service keeps shifting from one node to the other (at least once a day).

After analysing the syslog I found that the service was being shifted because of some network fluctuation. Both nodes have two NICs bonded together and configured with the IPs below.

My network details:

192.168.1.2 - node 1 physical IP, class C subnet (bond0)
192.168.1.3 - node 2 physical IP, class C subnet (bond0)
192.168.1.4 - floating IP (cluster)

Since this is a very critical and busy server, it may be that under heavy network load some heartbeat signals are getting missed, resulting in the service shifting from one node to the other.

So I plan to connect a crossover cable for the heartbeat messages. Can anyone guide me, or point me to a link that best explains how to do this and what changes I have to make in the cluster configuration file after connecting the crossover cable?

Regards,
Lingu
Morten Torstensen
2008-Nov-06 12:37 UTC
[CentOS] Cluster Heart Beat Using Cross Over Cable
lingu wrote:
> Since it is a very critical and busy server may be due to heavy
> network load some hear beat signal is getting missed resulting in
> shifting of service from one node to another.

For automated takeover systems, especially critical ones (though you can argue that any system set up with automatic takeover is critical by definition), you should have multiple heartbeat paths: Ethernet, serial cable, shared disk, fibre or whatnot. If heartbeats are being missed on one set of Ethernet cards and causing false takeovers, they could just as likely be missed on another set of cards, even with a crossover cable. Maybe you should investigate alternate paths?

--
//Morten Torstensen
//Email: morten at mortent.org
//IM: morten.torstensen at gmail.com

I can't listen to that much Wagner. I start getting the urge to conquer Poland. -- Woody Allen
Flaherty, Patrick
2008-Nov-06 17:12 UTC
[CentOS] Cluster Heart Beat Using Cross Over Cable
> I am running two node active/passive cluster running Centos3 update
> 8 64 bit on Hp Box with external hp storage connected via scsi. My
> cluster was running fine for last 3 years.But all of a sudden cluster
> service keep on shifting (atleast one time in a day )form one node to
> another.
>
> After analysed the syslog i found that due to some network
> fluctuation service was getting shifted.Both the nodes has two NIC
> bonded together and configured with below ip.
>
> My network details:
>
> 192.168.1.2 --node 1 physical ip with class c subnet (bond0 )
> 192.168.1.3 --node 2 physical ip with class c subnet (bond0 )
> 192.168.1.4 --- floating ip ( cluster )
>
> Since it is a very critical and busy server may be due to heavy
> network load some hear beat signal is getting missed resulting in
> shifting of service from one node to another.
>
> So i planned to connect crossover cable for heart beat messages, can
> any one guide me or provide me the link that best explains how to do
> the same and the changes i have to made in cluster configuration file
> after connecting the crossover cable.

Hi Lingu,

I realize you're just trying to get this fixed, but what happened on your network to make an HA pair that has been stable for three years start getting flaky?

Everything I know about heartbeat comes from http://www.linux-ha.org/ha.cf. I'm fairly sure all you need to add to your config is a bcast line. In a simple setup, "bcast eth0 eth1" will send heartbeats over eth0 and eth1; you will probably want "bcast bond0 eth3", depending on your interface names. Make sure you are also pinging a third host (the subnet's gateway is a good choice).

There was a heartbeat security issue a couple of years ago, so you should consider planning an upgrade to a patched version.

Patrick
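
For illustration only, the two suggestions above might look roughly like this in ha.cf. This is a sketch under assumptions, not a tested configuration: it assumes the cluster really is running Linux-HA Heartbeat (the original post does not say which cluster software is in use), that eth3 is the interface on the crossover cable, and that 192.168.1.1 is the subnet gateway. Neither that interface name nor that gateway address is confirmed in the thread.

```
# /etc/ha.d/ha.cf (fragment) - illustrative sketch only

# Send heartbeats over both the bonded LAN interface and the
# crossover link. "eth3" is a placeholder; use the name of the
# NIC actually connected to the crossover cable.
bcast bond0 eth3

# Ping a third host so a node can tell whether it has lost its own
# network connectivity. 192.168.1.1 is an assumed gateway address.
ping 192.168.1.1
```

The crossover interface also needs its own IP address on each node, on a subnet separate from 192.168.1.0/24, and the same ha.cf change has to be made on both nodes before restarting heartbeat.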