Manuel Wolfshant
2017-Feb-13 14:53 UTC
[Nut-upsuser] NUT configuration complicated by Stonith/Fencing cabling
On 02/13/2017 04:39 PM, Charles Lepple wrote:
> On Feb 13, 2017, at 8:08 AM, Tim Richards <tims_tank at hotmail.com> wrote:
>> Feb 13 23:11:42 systemd[1] Starting LSB: UPS monitoring software (deprecated, remote/local)...
>> Feb 13 23:11:43 usbhid-ups[2093] Startup successful
>> Feb 13 23:11:43 upsd[1932] Starting NUT UPS drivers ..done
>> Feb 13 23:11:43 upsd[2104] not listening on 192.168.1.22 port 3493
>> Feb 13 23:11:43 upsd[2104] listening on ::1 port 3493
>> Feb 13 23:11:43 upsd[2104] listening on 127.0.0.1 port 3493
>> Feb 13 23:11:43 upsd[2104] no listening interface available
> It looks like you have a "LISTEN 192.168.1.22:3493" line in upsd.conf (in addition to the ones for the IPv4/IPv6 loopback addresses). This worked fine back when the init system actually finished bringing up all of the network interfaces before attempting to start NUT. However, it seems that when moving to systemd, many distributions are depending on some generic multi-user target, rather than the completion of the networking setup. The log message "not listening on" indicates that upsd tried and failed to bind to that listening address.
>
> I made a note to document this: https://github.com/networkupstools/nut/issues/393
>
> I don't have a scratch system running systemd, so I don't have any recommendations on adjusting the dependencies (to wait until networking is really up). There have been some related discussions on other GitHub issues and pull requests: https://github.com/networkupstools/nut/labels/systemd

https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/
http://stackoverflow.com/questions/32873571/debian-systemd-service-starts-before-network-is-ready
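For anyone who wants to reproduce the "not listening on" condition outside of upsd, here is a minimal Python sketch. It is not how upsd or systemd handle this; it only tries the same kind of bind that fails when the address is not yet assigned to an interface that is up. The helper names `can_bind` and `wait_until_bindable` are made up for illustration, and port 0 (any free port) is used so the check does not collide with a running upsd on 3493.

```python
import socket
import time


def can_bind(host, port=0):
    """Return True if a TCP socket can be bound to host:port right now.

    Binding fails with OSError when the address is not configured on
    any interface, which is the situation behind "not listening on".
    """
    family = socket.AF_INET6 if ":" in host else socket.AF_INET
    try:
        with socket.socket(family, socket.SOCK_STREAM) as sock:
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            sock.bind((host, port))
        return True
    except OSError:
        return False


def wait_until_bindable(host, timeout=30.0, interval=1.0,
                        check=can_bind, sleep=time.sleep):
    """Poll until `host` can be bound or `timeout` seconds have passed.

    `check` and `sleep` are injectable so the loop can be exercised
    without touching the network or actually waiting.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check(host):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

Calling `wait_until_bindable("192.168.1.22")` before starting upsd by hand would show whether the address from the LISTEN line is usable yet. On a systemd machine the cleaner route is probably the unit ordering described in the links above, not a polling loop like this one.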
Tim Richards
2017-Feb-14 00:08 UTC
[Nut-upsuser] NUT configuration complicated by Stonith/Fencing cabling
Charles and Manuel,

Thanks for the help. Charles's pointer to the IP address that was 'not listening' gave me the hint. I had assigned the listening interface to the network cards linked by the crossover cable between the two nodes. I changed it to the switch-connected network cards and bingo. With the other node unpowered, the surviving node found its crossover-connected network card dead on reboot, hence no NUT.

Both nodes still go down on pulling one plug, but the mains-connected node comes back up with services working. And when I restart the other node, they still try to shoot (fence) each other. But that's fine, someone will be there checking why only one UPS has died. That's enough high availability for this use.

Hope this configuration is useful to someone else.

Regards, Tim.
Tim Richards
2017-Feb-15 23:57 UTC
[Nut-upsuser] NUT configuration complicated by Stonith/Fencing cabling
List,

In the interest of completeness, I emailed the author of the NUT fencing agent and asked him about his setup. His UPSes were all networked, so my USB "cross connected" use, while working, is probably beyond any design specs. His reply is quoted below.

Tim.

Hi Tim,

I'm afraid I'm going to disappoint you. We gave up on HA clusters at our site. The details of why I gave it up are here: <http://www.gossamer-threads.com/lists/linuxha/users/87132>

I can answer some of your questions, though:

- All of our UPSes have network cards.
- That NUT fencing agent script I wrote was not very good at fencing with networked APC UPSes. The problem is that there's a couple of seconds delay between the UPS changing status and the network interface reporting it correctly. It's possible for you to issue the network command to turn power on to the UPS, only to have it report "OFF" if you query it again too quickly.
- Because of the networking, every system in the cluster can query every UPS. This is important, because if a UPS is supplying power to a switch that connects a system (say webserver2) to its UPS (webserver2-ups), then you want to make sure that switch is powered by yet another UPS (e.g., switch-ups) so that fencing doesn't block communications to the UPS. In the event of a power outage, you want the systems to shutdown cleanly if the battery in webserver2-ups OR switch-ups is running out.
- To restart a system on my cluster, I issued the network commands to the UPS on the STONITHed machine.
- I let the UPS's own "BATTERY LOW" signal tell me when to shut down a system. However, I adjusted the parameters to do this at least five minutes before the battery ran out. This required annual full calibration of each UPS, to make sure I knew that "five minute" estimate was reliable.
- If you want the full gory details, which I think may not be relevant to you, you can read my 2013 description of my setup: <https://twiki.nevis.columbia.edu/twiki/bin/view/Main/PacemakerDualPrimaryConfiguration>

Good luck!
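Two points in the reply above lend themselves to a small Python sketch: polling a UPS until its reported status catches up with a power command, and shutting down when any UPS a system depends on is running low. This is an illustration only, not the fencing agent's code. `query_status` is a hypothetical callable standing in for whatever actually queries the UPS, and the status strings are assumed to follow NUT's `ups.status` flags (`OL` on line, `OB` on battery, `LB` low battery).

```python
import time
from typing import Callable, Mapping


def wait_for_status(query_status: Callable[[], str], expected: str,
                    timeout: float = 10.0, interval: float = 0.5,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Re-query until the UPS reports `expected` or `timeout` expires.

    This covers the few seconds in which the network interface may
    still report the old state (e.g. "OFF") after a power-on command.
    """
    deadline = time.monotonic() + timeout
    while True:
        if query_status() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


def should_shut_down(statuses: Mapping[str, str]) -> bool:
    """Return True if ANY listed UPS is on battery with low battery.

    `statuses` maps a UPS name to its status string; the names used
    below are the examples from the quoted reply.
    """
    for status in statuses.values():
        flags = status.split()
        if "OB" in flags and "LB" in flags:
            return True
    return False
```

With the names from the reply, `should_shut_down({"webserver2-ups": "OL", "switch-ups": "OB LB"})` is true, because the switch's UPS is running out even though the server's own UPS is still on mains.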