thr3ads.net - Nut upsuser - [Nut-upsuser] stale/dead ups logic [Sep 2015]

d tbsky

2015-Sep-14 05:36 UTC

[Nut-upsuser] stale/dead ups logic

Hi,

While testing NUT in our environment, we found a case where NUT could
perhaps be tuned for the "stale/dead UPS" situation. Currently a "dead"
UPS is assumed to be alive (i.e. a host shutdown is considered
unnecessary) unless it was in the "OB" state before going stale.

Our environment (ServerA + ServerB form a cluster):

  ServerA -> USB to UPSA -> two power supplies, fed by UPSA and UPSB
             -> upsmon monitors both UPSes
  ServerB -> USB to UPSB -> two power supplies, fed by UPSA and UPSB
             -> upsmon monitors both UPSes

Now, if ServerB crashes and the power is then cut, ServerA won't shut
down, because UPSB is stale/dead.

On ServerA, I want UPSA to be treated as "alive / no shutdown needed"
when it is stale, but UPSB to be treated as "dead / shutdown needed"
when it is stale. To accomplish that, maybe upsmon could accept a flag
to indicate this:

 MONITOR ftups@localhost 1 monmaster passmaster master alive
 MONITOR ftups@10.1.1.1 1 monslave nutslave slave dead

Maybe there are other situations where people would want to declare a
stale UPS as "dead / shutdown needed".
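For readers following the proposal, the requested behaviour can be sketched roughly as below. This is only an illustration, not upsmon's actual code: the `stale_policy` field stands in for the hypothetical trailing `alive`/`dead` flag, `MonitoredUps` and both function names are invented for the example, and the power-value comparison is a simplified model of how MINSUPPLIES is weighed.

```python
from dataclasses import dataclass


@dataclass
class MonitoredUps:
    name: str
    power_value: int    # number of power supplies this UPS feeds
    status: str         # last known ups.status, e.g. "OL", "OB", "OB LB"
    stale: bool         # True once the data for this UPS has gone stale
    stale_policy: str   # hypothetical flag from the proposal: "alive" or "dead"


def is_critical(ups: MonitoredUps) -> bool:
    """Decide whether this UPS should be counted as unable to supply power."""
    flags = ups.status.split()
    if ups.stale:
        if ups.stale_policy == "dead":
            return True
        # Behaviour described in the thread: a stale UPS is assumed alive
        # unless it was already on battery before going stale.
        return "OB" in flags
    return "OB" in flags and "LB" in flags


def should_shut_down(upses, min_supplies: int = 1) -> bool:
    """Shut down when the usable power value drops below min_supplies."""
    usable = sum(u.power_value for u in upses if not is_critical(u))
    return usable < min_supplies
```

With UPSA at "OB LB" and UPSB stale after last reporting "OL", this model returns False under the "alive" policy (ServerA stays up, as described above) and True under the proposed "dead" policy.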

Regards,
tbskyd

Charles Lepple

2015-Sep-14 12:34 UTC

[Nut-upsuser] stale/dead ups logic

On Sep 14, 2015, at 1:36 AM, d tbsky <tbskyd at gmail.com> wrote:
>
> hi:
>    when testing nut in our environment, we found something that nut
> maybe tune for "stale/dead ups" situation. currently the "dead ups"
> are assume alive(eg: host shutdown unnecessary), unless it is in the
> "OB" state before going to stale.
Correct. "data stale" is meant to be an indication that something is
wrong with the UPS or driver, rather than as a definitive power state.
Ideally, upsmon sees a transition from "OL" to "OB" to "OB LB" without
any intervening "data stale" errors.

I would recommend investigating ways around the "data stale" error,
rather than investing time in logic that is only meant to be a last
resort. The driver man pages mention a few cases where the NUT default
timeouts might not be appropriate, and can be adjusted to avoid the
"data stale" condition.

-- 
Charles Lepple
clepple at gmail

Charles Lepple

2015-Sep-15 02:44 UTC

[Nut-upsuser] stale/dead ups logic

[forwarding to the list]

On Sep 14, 2015, at 10:39 PM, d tbsky <tbskyd at gmail.com> wrote:
>
> 2015-09-14 20:34 GMT+08:00 Charles Lepple <clepple at gmail.com>:
>> I would recommend investigating ways around the "data stale" error,
>> rather than investing time in logic that is only meant to be a last
>> resort. The driver man pages mention a few cases where the NUT
>> default timeouts might not be appropriate, and can be adjusted to
>> avoid the "data stale" condition.
>
>   if the host which attached to UPS crashed, there is no way to
> prevent "data stale". maybe  I can write script triggered by ups
> event. but that seems a little stupid since I already have "upsmon" to
> do most of the work. now I can only assume my environment won't have a
> "double fault": when one of the cluster server crashed or one of the
> power supply die, I won't get a power cut simultaneously...
> 
> Regards,
> tbskyd
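
The "script triggered by ups event" workaround mentioned above might look roughly like the following sketch. It assumes the script is configured as upsmon's NOTIFYCMD with the EXEC flag set for the relevant events, that upsmon exports NOTIFYTYPE to it, and that `upsc` exits non-zero when the data for a UPS is stale or unreachable. The UPS names are taken from the MONITOR lines earlier in the thread; the function names are made up for illustration, and none of this has been tested against a real installation.

```python
import os
import subprocess

LOCAL_UPS = "ftups@localhost"   # UPS attached to this host (from the thread)
REMOTE_UPS = "ftups@10.1.1.1"   # UPS attached to the other cluster node


def read_status(ups):
    """Return ups.status as reported by upsc, or None if it cannot be read."""
    try:
        result = subprocess.run(
            ["upsc", ups, "ups.status"],
            capture_output=True, text=True, timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


def should_force_shutdown(local_status, remote_status):
    """True when the local UPS is on low battery and the remote one is unreadable."""
    if local_status is None or remote_status is not None:
        return False
    flags = local_status.split()
    return "OB" in flags and "LB" in flags


def main():
    # Only react to battery events; other notifications are ignored.
    if os.environ.get("NOTIFYTYPE") not in ("ONBATT", "LOWBATT"):
        return
    if should_force_shutdown(read_status(LOCAL_UPS), read_status(REMOTE_UPS)):
        # Ask the running upsmon to start a forced shutdown.
        subprocess.run(["upsmon", "-c", "fsd"], check=False)


if __name__ == "__main__":
    main()
```

The decision is kept in a pure function so it can be checked without a UPS attached; only `main()` touches `upsc` and `upsmon`.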
