Jim Klimov
2025-Jun-30 18:18 UTC
[Nut-upsuser] Alert: REPLBATT active after battery replacement and requires reboot to clear
Hello,

You mention that you've tried restarting the "nut-server" - I suppose you mean literally the service unit by that name, i.e. the NUT data server. Did you try restarting the unit for the NUT driver (e.g. `systemctl restart nut-driver@upsname` with NUT v2.8.x and newer)?

You did not mention the driver used, but I wonder if that driver program "latches" the RB value when it goes bad and never updates it?.. This could make sense when UPS battery replacement means server downtime, but that is just a subset of real-life cases - so in general it may be just an oversight. For example, `bcmxcp` code seems to only ever set `bcmxcp_status.alarm_replace_battery=1` and never reset it (oddly, neither the field nor the struct is ever initialized to 0, so it might hold garbage on some systems/compilers that do not zero-out aggregate types by default).

Jim

On Mon, Jun 30, 2025 at 7:53 PM Vyasa via Nut-upsuser <nut-upsuser at alioth-lists.debian.net> wrote:

> Hello,
>
> CONFIGURATION:
>
> I am using a Powerware PW9120 3000i, on a network configuration with a
> server and a couple of slaves.
>
> The nut-server OS is *Debian 12 (6.1.0-37-amd64)*. Nut was installed
> from the Debian repo with version *2.8.0-7 amd64*, and client has the
> same version.
>
> UPS is connected with a standard RS232 serial connection, and works with
> all standard commands and functionality.
>
> Command "*upscmd -l upsname*" provides the following, where I have
> successfully used *test.battery.start* and *test.system.start*:
>
> beeper.disable - Disable the UPS beeper
> beeper.enable - Enable the UPS beeper
> beeper.mute - Temporarily mute the UPS beeper
> load.on - Turn on the load immediately
> outlet.1.load.off - Turn off the load on outlet 1 immediately
> outlet.1.load.on - Turn on the load on outlet 1 immediately
> outlet.1.shutdown.return - Turn off the outlet 1 and return when power is back
> outlet.2.load.off - Turn off the load on outlet 2 immediately
> outlet.2.load.on - Turn on the load on outlet 2 immediately
> outlet.2.shutdown.return - Turn off the outlet 2 and return when power is back
> shutdown.return - Turn off the load and return when power is back
> shutdown.stayoff - Turn off the load and remain off
> test.battery.start - Start a battery test
> test.system.start - Start a system test
>
> ISSUE:
>
> Every couple of years when I have to replace batteries in the UPS, I get
> an issue with not being able to clear the REPLBATT alert. That is, not
> until I reboot the server running NUT-SERVER. This might not seem like a
> big deal, but it becomes a hassle when batteries haven't quite failed yet
> and are still good after a UPS battery test.
>
> The UPS itself reports OK after battery replacement or battery test, and
> clears the alarm on its LCD. But when I poll the UPS data using
> "upsc upsname" I still see the RB or REPLBATT and this will not clear
> until I reboot the server. So without a reboot the alert will then be
> generated based on RBWARNTIME in upsmon.conf, which is as per NUT design.
>
> So without reboot I always get the RB flag with status:
>
> *Alert type: REPLBATT*
> *............*
> *ups.status: OL RB*
> *ups.test.result: Done and passed*
>
> After reboot of server the alert is cleared:
>
> *Alert type: COMMOK*
> *............*
> *ups.status: OL*
> *ups.test.result: Done and passed*
>
> So my question becomes: why is this reboot required? It doesn't seem to
> make any sense. I can't understand why the polled data from a UPS would
> change after a reboot, while on the UPS LCD it's reporting all OK. I tried
> restarting NUT-SERVER to see if it would make any difference. Also, the
> command test.battery.start will clear the alarm on the UPS if the battery
> test is good.
>
> The only explanation that I have come up with is that the persistent
> RB/REPLBATT is latched to this condition and is an artifact of UPS to NUT
> handshaking.
>
> Any feedback would be kindly appreciated, as I have searched and searched.
>
> Thank you!
> Vyasa
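The distinction drawn in the reply above is between the NUT data server (upsd) and the per-device driver process: the RB flag in ups.status originates in the driver, so restarting only the data server would not be expected to drop a latched value. A minimal sketch of mapping a NUT component to the systemd unit to restart follows. The unit names are assumptions based on NUT v2.8.x packaging, and the helper `nut_unit_for` and the device name `upsname` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the systemd unit for a NUT component.
# Unit names are assumptions based on NUT v2.8.x packaging; verify them
# with `systemctl list-units 'nut*'` on your own system.
nut_unit_for() {
    component=$1
    upsname=$2
    case "$component" in
        server) printf '%s\n' "nut-server.service" ;;             # upsd, the data server
        driver) printf '%s\n' "nut-driver@${upsname}.service" ;;  # per-device driver instance
        *)      echo "unknown component: $component" >&2; return 1 ;;
    esac
}

# Usage (not run here): restart only the driver for one device.
#   systemctl restart "$(nut_unit_for driver upsname)"
```

Restarting the unit printed for `server` leaves the driver process, and whatever state it holds, untouched.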
Vyasa
2025-Jun-30 22:35 UTC
[Nut-upsuser] Alert: REPLBATT active after battery replacement and requires reboot to clear
Hi Jim,

Thanks for the prompt response.

The restart I referred to was exactly as you say: I restarted the service using `systemctl restart nut-server`. This was separate from the reboot of the server machine that I mentioned, which resolves the issue.

The driver used was:
Network UPS Tools - UPS driver controller 2.8.0
Network UPS Tools - BCMXCP UPS driver 0.32 (2.8.0)

I simulated the fault again by putting the UPS in bypass and disconnecting the battery. This caused the RB alert again. I then reconnected the battery, restored the UPS to normal operating condition, and used upsdrvctl to STOP and START the driver.

Generating alert condition for simulating RB:
Alert type: REPLBATT
.....................
ups.status: ALARM OL BYPASS RB
ups.test.result: Done and error

Alert cleared on UPS, and alert condition with RB persisting on NUT-SERVER:
Alert type: ONLINE
.................
ups.status: OL RB
ups.test.result: Done and passed

Restarting using upsdrvctl start/stop command clears RB:
Alert type: COMMOK
..................
ups.status: OL
ups.test.result: Done and passed

So it seems that your suspicions and mine have been verified: bcmxcp appears to "latch" the alarm until a driver restart or server reboot.

I think you are correct that this can cause issues in other real-life cases. I am thinking here of automation, scripting and so forth.

What would you suggest at this point? Can this be submitted as a bug?

Vyasa
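For the scripting case mentioned above, the manual workaround could be automated roughly as follows. This is a sketch only: the function names are hypothetical, and treating RB as stale whenever ups.test.result reports a passed test is an assumption drawn from the outputs in this thread, not documented NUT behaviour. It restarts the driver with `upsdrvctl stop` and `upsdrvctl start`, as was done by hand.

```shell
#!/bin/sh
# Return 0 if the space-separated status string ($1) contains the token ($2).
status_has_flag() {
    for token in $1; do
        [ "$token" = "$2" ] && return 0
    done
    return 1
}

# Return 0 when RB looks stale: RB is set although the last test passed.
# (Assumption: a passed test means the battery is actually fine.)
rb_looks_stale() {
    status=$1
    test_result=$2
    status_has_flag "$status" RB || return 1
    case "$test_result" in
        *passed*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Hypothetical wrapper (not called here): read the values via upsc and
# restart the driver if RB appears stale.
clear_stale_rb() {
    ups=$1
    if rb_looks_stale "$(upsc "$ups" ups.status)" "$(upsc "$ups" ups.test.result)"; then
        upsdrvctl stop "$ups" && upsdrvctl start "$ups"
    fi
}
```

With the values reported in this thread, `rb_looks_stale "OL RB" "Done and passed"` succeeds, while `rb_looks_stale "ALARM OL BYPASS RB" "Done and error"` does not, so a genuine fault would not trigger a driver restart.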
Jim Klimov
2025-Jul-01 07:23 UTC
[Nut-upsuser] Alert: REPLBATT active after battery replacement and requires reboot to clear
I think yes, seems like a valid bug.

Also, as you mention `upsdrvctl`, systemd and NUT v2.8.x together, take a look at https://github.com/networkupstools/nut/wiki/nut%E2%80%90driver%E2%80%90enumerator-(NDE) - it may be more applicable to use `upsdrvsvcctl` instead nowadays.

Jim