Hi Guys,

I am setting up a heartbeat cluster for my 2 MDS servers. However, I am running into the following issue. If I power off the passive node so that heartbeat shuts down uncleanly, then after the server is brought back online and the heartbeat services are started, all my resources are shut down, even though they are running on the active node, and are then brought back online automatically. Am I missing some settings here? Stickiness? I have been unable to get this to work.

Also, do I need to disable the lvm2-monitor service on my MDSs?

Any assistance would be greatly appreciated.

Thanks in advance,
-J
In reply to Jagga Soorma (2010/1/19):

If it is a two-node-only heartbeat cluster (an MDS pair), I would recommend configuring Linux-HA heartbeat to work in v1 mode (CRM off). The configuration is much simpler and it is easy to operate.

-- 
Wojciech Turek
Assistant System Manager
High Performance Computing Service
University of Cambridge
Email: wjt27 at cam.ac.uk
Jagga Soorma wrote:
> Am I missing some settings here? Stickiness? I have been
> unable to get this to work.

Without logs it's hard to say, but it sounds like it may be a resource-stickiness issue. Setting the default resource stickiness to something high, like 1000 or 2000, will usually keep resources stuck to a node until you tell them to move (with a higher score/INFINITY).

Also, make sure no services that heartbeat manages are started at boot. This includes making sure your MDS and OSS filesystems are not in /etc/fstab.

Good luck,
-- 
Adam Gandelman
LINBIT | Your Way to High Availability
http://www.linbit.com
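The /etc/fstab advice above can be checked mechanically. What follows is only a rough sketch, not something from the thread: the function name `lustre_fstab_entries` is made up for illustration, and it assumes the targets in question are listed with filesystem type `lustre` in the third fstab field. It flags every such entry, since the suggestion was that these filesystems should not appear in /etc/fstab at all.

```python
def lustre_fstab_entries(fstab_text):
    """Return the fstab lines whose filesystem type is 'lustre'.

    Hypothetical helper: any line returned here describes a mount that
    the boot process, rather than heartbeat, could end up handling.
    """
    found = []
    for raw in fstab_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # fstab fields: device, mount point, type, options, dump, pass.
        if len(fields) >= 3 and fields[2] == "lustre":
            found.append(line)
    return found
```

To try it on a node, read the file and print whatever comes back, for example `print(lustre_fstab_entries(open("/etc/fstab").read()))`. An empty list means no Lustre entries were found under the assumption stated above.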