Tim Richards
2017-Feb-13 13:08 UTC
[Nut-upsuser] NUT configuration complicated by Stonith/Fencing cabling
Charles,

Thanks for your reply. Indeed you may be right that the NUT fencing agent might be written with networked UPSes in mind, as healthy nodes could use the network to issue "fence" orders to remove unhealthy ones. I will post here if I find more info.

The problem with the resupply of services is that NUT doesn't restart on the node that comes back up. To recap: I pull the power on one UPS, and both nodes shut down. The remaining mains-connected UPS power cycles its outlets, which reboots its node. Because the node has just started, it wants all of its services to be healthy before providing them. This includes the fencing agent, which relies on NUT, which hasn't started. So the node doesn't start the rest of its services (Apache, MySQL, Samba).

Relevant log entries:

Feb 13 23:11:42 xinetd[1647] Reading included configuration file: /etc/xinetd.d/cups-lpd [file=/etc/xinetd.d/cups-lpd] [line=7]
Feb 13 23:11:42 systemd[1] Starting LSB: UPS monitoring software (deprecated, remote/local)...
Feb 13 23:11:43 usbhid-ups[2093] Startup successful
Feb 13 23:11:43 upsd[1932] Starting NUT UPS drivers ..done
Feb 13 23:11:43 upsd[2104] not listening on 192.168.1.22 port 3493
Feb 13 23:11:43 upsd[2104] listening on ::1 port 3493
Feb 13 23:11:43 upsd[2104] listening on 127.0.0.1 port 3493
Feb 13 23:11:43 upsd[2104] no listening interface available
Feb 13 23:11:43 startproc[2095] startproc: exit status of parent of /usr/sbin/upsd: 1
Feb 13 23:11:43 usbhid-ups[2093] Signal 15: exiting
Feb 13 23:11:43 upsd[1932] Starting NUT UPS server ..failed
Feb 13 23:11:43 systemd[1] upsd.service: Control process exited, code=exited status=7
Feb 13 23:11:43 systemd[1] Failed to start LSB: UPS monitoring software (deprecated, remote/local).
Feb 13 23:11:43 systemd[1] upsd.service: Unit entered failed state.
Feb 13 23:11:43 systemd[1] upsd.service: Failed with result 'exit-code'.

I can manually bring the surviving node's services back up by removing the requirement that Stonith services are enabled.
I cannot get NUT to restart until I restart the 2nd node.

Regards,
Tim.

-----Original Message-----
From: Charles Lepple [mailto:clepple at gmail.com]
Sent: Sunday, 12 February 2017 11:57 AM
To: Tim Richards
Cc: nut-upsuser Mailing List
Subject: Re: [Nut-upsuser] NUT configuration complicated by Stonith/Fencing cabling

On Feb 10, 2017, at 5:48 PM, Tim Richards <tims_tank at hotmail.com> wrote:
> I am trying to kill two birds with one stone, that is UPS protection from power failure and cluster node fencing (Stonith) with the UPS ability to cut power to a node. Somebody has done this, as there exists a fencing agent using NUT in the Pacemaker/Corosync (Linux-HA cluster software), I just don't know the best way to go about it.

Some UPS models have more than one serial port, or have a network adapter which can support multiple monitoring systems (via SNMP or HTTP/XML). Is it possible that the NUT fencing agent was written with that case in mind? That would mean that neither node would depend on the other for UPS status.

Can you elaborate on the "resupply of services problem"? With cross-connected UPSes (and only a single comm port per UPS), I am not sure if you can achieve both goals when only one UPS loses power.

(I don't think this sort of setup has been discussed much on the NUT lists, although it certainly sounds like an interesting way to use NUT. If you do find out more about how the NUT fencing agent was intended to be configured, perhaps from the fencing software lists or forums, feel free to post that here as well.)

--
- Charles Lepple https://ghz.cc/charles/
Charles Lepple
2017-Feb-13 14:39 UTC
[Nut-upsuser] NUT configuration complicated by Stonith/Fencing cabling
On Feb 13, 2017, at 8:08 AM, Tim Richards <tims_tank at hotmail.com> wrote:
> Feb 13 23:11:42 systemd[1] Starting LSB: UPS monitoring software (deprecated, remote/local)...
> Feb 13 23:11:43 usbhid-ups[2093] Startup successful
> Feb 13 23:11:43 upsd[1932] Starting NUT UPS drivers ..done
> Feb 13 23:11:43 upsd[2104] not listening on 192.168.1.22 port 3493
> Feb 13 23:11:43 upsd[2104] listening on ::1 port 3493
> Feb 13 23:11:43 upsd[2104] listening on 127.0.0.1 port 3493
> Feb 13 23:11:43 upsd[2104] no listening interface available

It looks like you have a "LISTEN 192.168.1.22:3493" line in upsd.conf (in addition to the ones for the IPv4/IPv6 loopback addresses). This worked fine back when the init system actually finished bringing up all of the network interfaces before attempting to start NUT. However, it seems that when moving to systemd, many distributions are depending on some generic multi-user target, rather than the completion of the networking setup. The log message "not listening on" indicates that upsd tried and failed to bind to that listening address.

I made a note to document this: https://github.com/networkupstools/nut/issues/393

I don't have a scratch system running systemd, so I don't have any recommendations on adjusting the dependencies (to wait until networking is really up). There have been some related discussions on other GitHub issues and pull requests: https://github.com/networkupstools/nut/labels/systemd

Another option might be to use "LISTEN 0.0.0.0:3493" and adjust the firewall rules to only allow packets destined for that interface. (One would hope that the firewall startup script is robust enough to handle this, but again, this is going to be highly dependent on the init system and its configuration.)

Hopefully this gets you further in the startup process.
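
To illustrate the failure mode described above: a bind to a specific address fails while that address is not yet assigned to any interface. The following is only a rough sketch of how one might check for that condition before starting upsd. The function names can_bind and wait_for_address are hypothetical and are not part of NUT; the address 192.168.1.22 is simply the one from the log. It binds to an ephemeral port (0) so it does not compete with upsd for port 3493.

```python
import socket
import time


def can_bind(address: str, port: int = 0) -> bool:
    """Return True if a TCP socket can currently be bound to `address`.

    Hypothetical helper, not part of NUT. Port 0 asks the kernel for any
    free port, so the check only exercises the address itself.
    """
    family = socket.AF_INET6 if ":" in address else socket.AF_INET
    try:
        with socket.socket(family, socket.SOCK_STREAM) as sock:
            sock.bind((address, port))
    except OSError:
        # Typically "cannot assign requested address" while the interface
        # that owns `address` has not been configured yet.
        return False
    return True


def wait_for_address(address: str, timeout: float = 30.0,
                     interval: float = 1.0) -> bool:
    """Poll until `address` is bindable or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if can_bind(address):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Something like wait_for_address("192.168.1.22") could be run from a wrapper script before upsd is started, but that is a workaround under the assumptions above; ordering the service after the completion of network setup in the init system is the cleaner fix.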
Manuel Wolfshant
2017-Feb-13 14:53 UTC
[Nut-upsuser] NUT configuration complicated by Stonith/Fencing cabling
On 02/13/2017 04:39 PM, Charles Lepple wrote:
> On Feb 13, 2017, at 8:08 AM, Tim Richards <tims_tank at hotmail.com> wrote:
>> Feb 13 23:11:42 systemd[1] Starting LSB: UPS monitoring software (deprecated, remote/local)...
>> Feb 13 23:11:43 usbhid-ups[2093] Startup successful
>> Feb 13 23:11:43 upsd[1932] Starting NUT UPS drivers ..done
>> Feb 13 23:11:43 upsd[2104] not listening on 192.168.1.22 port 3493
>> Feb 13 23:11:43 upsd[2104] listening on ::1 port 3493
>> Feb 13 23:11:43 upsd[2104] listening on 127.0.0.1 port 3493
>> Feb 13 23:11:43 upsd[2104] no listening interface available
>
> It looks like you have a "LISTEN 192.168.1.22:3493" line in upsd.conf (in addition to the ones for the IPv4/IPv6 loopback addresses). This worked fine back when the init system actually finished bringing up all of the network interfaces before attempting to start NUT. However, it seems that when moving to systemd, many distributions are depending on some generic multi-user target, rather than the completion of the networking setup. The log message "not listening on" indicates that upsd tried and failed to bind to that listening address.
>
> I made a note to document this: https://github.com/networkupstools/nut/issues/393
>
> I don't have a scratch system running systemd, so I don't have any recommendations on adjusting the dependencies (to wait until networking is really up). There have been some related discussions on other GitHub issues and pull requests: https://github.com/networkupstools/nut/labels/systemd

https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/
http://stackoverflow.com/questions/32873571/debian-systemd-service-starts-before-network-is-ready