Hi,

While testing NUT in our environment, we found that NUT could perhaps be tuned for the "stale/dead UPS" situation. Currently a "dead" UPS is assumed to be alive (i.e. a host shutdown is unnecessary) unless it was in the "OB" state before going stale.

Our environment (ServerA + ServerB form a cluster):

ServerA -> USB to UPSA -> two power supplies fed by UPSA and UPSB -> upsmon monitors both UPSes
ServerB -> USB to UPSB -> two power supplies fed by UPSA and UPSB -> upsmon monitors both UPSes

Now if ServerB crashes and then the power is cut, ServerA won't shut down, since UPSB is stale/dead.

On ServerA, I want UPSA to be treated as "alive / no shutdown needed" when it is stale, but UPSB to be treated as "dead / shutdown needed" when it is stale. To accomplish that, maybe upsmon could have a flag to indicate this:

MONITOR ftups@localhost 1 monmaster passmaster master alive
MONITOR ftups@10.1.1.1 1 monslave nutslave slave dead

Maybe there are other situations where people want to declare a "stale" UPS as "dead / shutdown needed".

Regards,
tbskyd
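To make the proposal above concrete, here is a rough Python sketch of the decision it implies. It is not upsmon's code. The `stale_policy` field stands in for the hypothetical trailing `alive`/`dead` flag, which does not exist in upsmon. The "critical" rules are a simplified reading of the documented behaviour: a UPS counts against the host when it is "OB LB", has FSD set, or went stale while on battery. A `min_supplies` of 1 is an assumption about the MINSUPPLIES setting in use.

```python
from dataclasses import dataclass


@dataclass
class Ups:
    name: str                     # e.g. "ftups@localhost"
    power_value: int              # the number after the system name in MONITOR
    status: set                   # last known status flags, e.g. {"OL"} or {"OB", "LB"}
    stale: bool = False           # True once upsmon reports "data stale"
    stale_policy: str = "alive"   # HYPOTHETICAL flag from the proposal: "alive" or "dead"


def is_critical(ups):
    """Return True if this UPS should no longer count as a power source."""
    if "FSD" in ups.status:
        return True
    if not ups.stale:
        # Fresh data: critical only when on battery AND battery low.
        return {"OB", "LB"} <= ups.status
    if ups.stale_policy == "dead":
        # Proposed behaviour: a stale UPS is treated as dead outright.
        return True
    # Current behaviour as described in this thread: a stale UPS only
    # counts as dead if it was on battery before going stale.
    return "OB" in ups.status


def should_shutdown(upses, min_supplies=1):
    """Shut down when the non-critical UPSes supply too little power."""
    available = sum(u.power_value for u in upses if not is_critical(u))
    return available < min_supplies


if __name__ == "__main__":
    upsa = Ups("ftups@localhost", 1, {"OB", "LB"})
    upsb = Ups("ftups@10.1.1.1", 1, {"OL"}, stale=True, stale_policy="dead")
    print(should_shutdown([upsa, upsb]))
```

With `stale_policy="alive"` on UPSB the same scenario yields no shutdown, which is the situation ServerA is in today.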
On Sep 14, 2015, at 1:36 AM, d tbsky <tbskyd at gmail.com> wrote:
>
> hi:
> when testing nut in our environment, we found something that nut
> maybe tune for "stale/dead ups" situation. currently the "dead ups"
> are assume alive(eg: host shutdown unnecessary), unless it is in the
> "OB" state before going to stale.

Correct. "data stale" is meant to be an indication that something is wrong with the UPS or driver, rather than a definitive power state. Ideally, upsmon sees a transition from "OL" to "OB" to "OB LB" without any intervening "data stale" errors.

I would recommend investigating ways around the "data stale" error, rather than investing time in logic that is only meant to be a last resort. The driver man pages mention a few cases where the NUT default timeouts might not be appropriate, and can be adjusted to avoid the "data stale" condition.

--
Charles Lepple
clepple at gmail
[forwarding to the list]

On Sep 14, 2015, at 10:39 PM, d tbsky <tbskyd at gmail.com> wrote:
>
> 2015-09-14 20:34 GMT+08:00 Charles Lepple <clepple at gmail.com>:
>> I would recommend investigating ways around the "data stale" error, rather than investing time in logic that is only meant to be a last resort. The driver man pages mention a few cases where the NUT default timeouts might not be appropriate, and can be adjusted to avoid the "data stale" condition.
>
> if the host which attached to UPS crashed, there is no way to
> prevent "data stale". maybe I can write script triggered by ups
> event. but that seems a little stupid since I already have "upsmon" to
> do most of the work. now I can only assume my environment won't have a
> "double fault": when one of the cluster server crashed or one of the
> power supply die, I won't get a power cut simultaneously...
>
> Regards,
> tbskyd
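For reference, the "script triggered by ups event" workaround mentioned above might look roughly like the Python sketch below, hooked in through NOTIFYCMD in upsmon.conf with EXEC added to the NOTIFYFLAG for LOWBATT. It relies on upsmon exporting NOTIFYTYPE and UPSNAME to the script, on `upsc` to query the remote UPS, and on `upsmon -c fsd` to force the shutdown. The UPS names come from the MONITOR lines earlier in the thread. The sketch assumes that `upsc` exits non-zero when the data is stale or the server is unreachable, and that UPSNAME has the `ups@host` form; neither has been checked here. The function names are made up for illustration.

```python
import os
import subprocess

LOCAL_UPS = "ftups@localhost"    # UPS attached to this host (from the thread)
REMOTE_UPS = "ftups@10.1.1.1"    # UPS attached to the other cluster node


def needs_shutdown(notify_type, ups_name, remote_ok):
    """Decide whether to force a shutdown.

    Shut down only when the local UPS reports low battery and the remote
    UPS cannot be queried, i.e. the remote UPS is treated as dead.
    """
    return notify_type == "LOWBATT" and ups_name == LOCAL_UPS and not remote_ok


def remote_answers():
    """Return True if upsc can read the remote UPS status.

    Assumption: upsc exits non-zero on "Data stale" or connection errors.
    """
    result = subprocess.run(
        ["upsc", REMOTE_UPS, "ups.status"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def main():
    notify_type = os.environ.get("NOTIFYTYPE", "")
    ups_name = os.environ.get("UPSNAME", "")
    # Skip the remote query unless the event is relevant.
    if notify_type != "LOWBATT" or ups_name != LOCAL_UPS:
        return
    if needs_shutdown(notify_type, ups_name, remote_answers()):
        # Ask the running upsmon to start a forced shutdown.
        subprocess.run(["upsmon", "-c", "fsd"], check=False)


if __name__ == "__main__":
    main()
```

This only covers the case where the local UPS itself still reports LOWBATT; it duplicates logic that upsmon already has, which is the drawback raised in the message.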