Oleg Semyonov
2013-Jan-29 11:05 UTC
[Nut-upsuser] APC SmartUPS 1500 (USB) does not report OL/OB state
Hi,

I have the following system configuration (it probably does not matter, but anyway): an Ubuntu Linux x86 VM running under the ESXi bare-metal hypervisor. The UPS is connected to the ESXi host, and the VM communicates with the UPS using the USB passthrough option. The details of the elaborate shutdown sequence are off topic, so I skip them. NUT was installed from the package, and I believe it is recent enough.

NUT was set up with default options. Everything worked, but sometimes I received a lot of log messages about USB timeouts. It might run for 2 hours or 2 days without them, and then produce a lot of such errors. Restarting NUT helped, so this was not a VM/host hardware problem.

As suggested everywhere, I tried to change the timeouts. Changing pollfreq made no difference. But changing the driver pollinterval to 10 (instead of the default 2) helped: no more timeouts were received.

But then I found that NUT no longer sees OnBattery events at all. If I pull the power cord from the UPS, it immediately reports input voltage = 0, but the UPS still "is" OnLine. I found that very weird and started to play with the timeouts again. I tried different values and also set the pollonly flag, with no luck. Setting any pollinterval value above the default 2 resulted in missed power state change reports (while UPS data values were still shown). Yes, I see with upsc that the UPS is OL (or OL CHRG if it was charging) with an input voltage of 0 V. With pollinterval=4 it reported the battery state *sometimes* but could miss the opposite transition, etc. In short, it was unreliable.

As a last resort, I set pollinterval=0, and wow, it now reports power state transitions almost instantly and works reliably. I can't say yet whether it will give me timeouts (it has worked for 24 hours so far, but as noted above, that means nothing yet).

But CPU consumption increased significantly. Before the change this VM consumed around 7 MHz of CPU share when idle (according to the VMware monitor); now it consumes 100-160 MHz (and 1% according to top inside the VM). The usbhid-ups driver is either in S state (sleep) or, mostly, in D (uninterruptible sleep):

1498 nut 20 0 2636 588 360 D %CPU=2 0.1 16:05.10 usbhid-ups

My guess was that, similarly to the "pollonly" flag, which deals with broken HID interrupts, setting pollinterval=0 means an "interrupt only" mode (to deal with broken polls which don't report the proper OL/OB power state). Looking into the source, I found that a zero value is not supported at all, and that it effectively means an infinite select wait time (because you subtract at least 1 second from unsigned time_t=0 if pollinterval=0).

So my questions are:

1) Any suggestions on how to fix or debug the problem with improper UPS state reporting? As a reminder, it gives timeouts with the default pollinterval value. With values above the default 2 seconds it reports voltages etc. properly, but does not report UPS states (OL/OB), or reports them unreliably, keeping the previous states.

2) Probably the code could (should?) be rewritten to support a "no poll" (interrupt only) mode. Right now pollinterval=0 works best for me, but I see that this is not by design, only by luck. And instead of "no polls" it means "wait indefinitely", yet it consumes CPU.

Oleg