Shade Alabsa
2014-Nov-18 23:28 UTC
[Nut-upsuser] Tripp-Lite USB constantlly disconnecting.
All,

We think we found something that is causing the problem, but as of right now we are unsure why it might be causing the issue. Recently I had to restart testing to figure this out, and for some odd reason I wasn't able to reproduce it on any system but our own custom system. I then noticed that I had not set pollinterval in ups.conf on the test machines; we normally set it to 15. I set it on a RHEL 6.6 test machine, which caused the issue to happen again. On our own custom system we took the pollinterval setting out, and preliminary testing doesn't show the repeated disconnects/reconnects.

I started playing more with pollinterval and noticed that values below 14 seem to be fine. At 14 it happens, but the messages are really spread out; I don't have an exact time delta for this. Anything from 15 up displays the same symptoms, just faster: at 15 it is roughly every 40-45 seconds, and at 20 about a minute in between.

We are still running the same versions of everything pasted before, so nut is 2.6.5-2 and the kernel is 2.6.32-431.23.3.el6.x86_64.

I tried looking through the code but haven't found anything that would cause these disconnects; I do see where it tries to reconnect, though. Does anybody else have any other ideas about this?

Thanks!
Shade Alabsa

On Tue, Sep 30, 2014 at 8:33 AM, Shade Alabsa <shade34321 at gmail.com> wrote:
> Charles,
>
> > If you run lsusb several times, does it still work? The exact output of
> > lsusb isn't as important as whether anything gets logged by the kernel.
> > Running lsusb shouldn't cause any extra kernel messages such as the
> > disconnection/reconnection messages shown here:
>
> Running lsusb several times works just fine and no disconnects are observed.
>
> > This doesn't seem to match the source code, which tries to claim the
> > interface up to three times, and if it doesn't work, it exits with a fatal
> > error. Your logs show the same PID for usbhid-ups, so it apparently didn't
> > exit. I am wondering if I am looking at the same code as what is built on
> > your system. Do you have the exact version for the RPM files, or better
> > yet, the corresponding SRPMs?
>
> For this testing we actually just have whatever comes installed via yum on
> CentOS 6.5 in the EPEL I believe. Throughout this process I can successfully
> run upsc to obtain the UPS system for a while, roughly 24 hours before it
> starts reporting as stale and is no longer accessible. Maybe I forgot to
> mention that, if so I'm sorry. Below is the output of yum as I'm installing
> it from yum.
>
> ==================================================================
>  Package       Arch      Version        Repository    Size
> ==================================================================
> Installing:
>  nut           x86_64    2.6.5-2.el6    epel          1.2 M
>  nut-client    x86_64    2.6.5-2.el6    epel          121 k
>
> Thanks for all of your help!
>
> Shade
>
> On Mon, Sep 29, 2014 at 10:26 PM, Charles Lepple <clepple at gmail.com> wrote:
> > On Sep 29, 2014, at 12:50 PM, Shade Alabsa <shade34321 at gmail.com> wrote:
> >
> >> The lsusb command did not trigger a disconnect. The output of that
> >> command is below.
> >
> > If you run lsusb several times, does it still work? The exact output of
> > lsusb isn't as important as whether anything gets logged by the kernel.
> > Running lsusb shouldn't cause any extra kernel messages such as the
> > disconnection/reconnection messages shown here:
> >
> > Sep 23 17:05:47 nemo kernel: usb 7-1: USB disconnect, device number 63
> > Sep 23 17:05:47 nemo kernel: usb 7-1: new low speed USB device number 64 using uhci_hcd
> > Sep 23 17:05:47 nemo kernel: usb 7-1: New USB device found, idVendor=09ae, idProduct=3015
> >
> >> I ran "usbhid-ups -a upsunit -DDD &> output.log" and
> >> I have attached the /var/log/messages and output.log to this email.
> >> Before running this test though I did clear out the messages so there
> >> isn't a whole lot there. I also contacted Tripp-Lite today and they
> >> are also looking into this.
> >
> > The part that really confuses me is this:
> >
> > Sep 23 17:06:05 nemo kernel: usb 7-1: usbfs: process 2291 (usbhid-ups) did not claim interface 0 before use
> > Sep 23 17:06:05 nemo kernel: usb 7-1: usbfs: process 2291 (usbhid-ups) did not claim interface 0 before use
> >
> > This doesn't seem to match the source code, which tries to claim the
> > interface up to three times, and if it doesn't work, it exits with a fatal
> > error. Your logs show the same PID for usbhid-ups, so it apparently didn't
> > exit. I am wondering if I am looking at the same code as what is built on
> > your system. Do you have the exact version for the RPM files, or better
> > yet, the corresponding SRPMs?
> >
> > --
> > Charles Lepple
> > clepple at gmail
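Since no exact time delta between disconnects was recorded, one way to get it is to pull the timestamps out of the kernel log. The following is only a minimal sketch, assuming log lines shaped like the "USB disconnect" messages quoted above; the function names `disconnect_deltas` and `report`, the `year` parameter (syslog timestamps carry no year) and the default log path are illustrative choices, not part of NUT.

```python
import re
from datetime import datetime

# Matches syslog lines such as:
#   Sep 23 17:05:47 nemo kernel: usb 7-1: USB disconnect, device number 63
DISCONNECT = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:.*USB disconnect"
)


def disconnect_deltas(lines, year=2014):
    """Return the seconds elapsed between consecutive USB disconnect messages."""
    stamps = []
    for line in lines:
        match = DISCONNECT.match(line)
        if match:
            # Collapse the double space syslog uses for single-digit days.
            stamp = " ".join(match.group("stamp").split())
            # Assumption: every line belongs to the given year.
            stamps.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


def report(path="/var/log/messages", year=2014):
    """Print the gap before each disconnect found in the log at `path`."""
    with open(path, errors="replace") as log:
        for delta in disconnect_deltas(log, year):
            print(f"{delta:.0f} s since previous disconnect")
```

Running `report()` once per pollinterval value under test would give comparable figures for 14, 15 and 20 instead of rough estimates.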
Charles Lepple
2014-Nov-19 01:40 UTC
[Nut-upsuser] Tripp-Lite USB constantlly disconnecting.
On Nov 18, 2014, at 6:28 PM, Shade Alabsa <shade34321 at gmail.com> wrote:

> I then noticed that in ups.conf I did not set the pollinterval, we normally
> set it to 15, so I set it to test on a RHEL 6.6 machine which caused this
> issue to happen again. On our own custom system we took the poll interval
> out and prelim testing doesn't show the repeated disconnects/reconnects.
> I started playing more with this pollinterval and noticed that up to 14,
> exclusive, it seems to be fine.

pollinterval defaults to 2, so maybe the UPS is disconnecting due to inactivity on the bus? (It shouldn't, but leaving pollinterval at the default might be a workaround for other systems.)

The usbhid-ups driver splits the polling into two categories: "quick updates", to catch OL/OB/LB transitions, and "full updates", to catch everything else. It is possible that the higher pollinterval values are colliding with the pollfreq interval of 30 seconds.

The UPS is requesting that the OS (or in the case of Linux, a program like NUT which uses libusb) poll the interrupt endpoint at least every 40 ms, which is fairly often, and would cause higher CPU load. The defaults in NUT strike a compromise.

Note that the man page for the MGE SHUT driver does recommend a higher value of pollinterval, but that is a special case (it is essentially USB over a serial cable, and the baud rate is not high enough for the defaults) and wouldn't apply to your UPS.

--
Charles Lepple
clepple at gmail
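To make the possible interaction between pollinterval and pollfreq more concrete, here is a toy model in Python. It is not the driver's code: it assumes the driver wakes up every pollinterval seconds and performs a full update only when strictly more than pollfreq seconds have passed since the last one, doing a quick update otherwise. The function names and the strict comparison are assumptions made for illustration.

```python
def full_update_times(pollinterval, pollfreq=30, duration=300):
    """Return the times (in seconds) at which the model runs a full update."""
    times = []
    last_full = 0
    # The model wakes up once every `pollinterval` seconds.
    for now in range(pollinterval, duration + 1, pollinterval):
        # Assumption: a full update needs strictly more than `pollfreq`
        # seconds since the previous one; every other wake-up is "quick".
        if now - last_full > pollfreq:
            times.append(now)
            last_full = now
    return times


def full_update_gap(pollinterval, pollfreq=30):
    """Return the steady gap (in seconds) between full updates in the model."""
    horizon = 10 * (pollinterval + pollfreq)  # long enough for two full updates
    times = full_update_times(pollinterval, pollfreq, duration=horizon)
    return times[1] - times[0]


for value in (2, 14, 15, 20):
    print(f"pollinterval={value}: full update every {full_update_gap(value)} s")
```

Under these assumptions the gap between full updates is 32 s for pollinterval 2, 42 s for 14, 45 s for 15 and 40 s for 20. The model only shows how the two settings combine into an effective full-update period; it says nothing about why the UPS would disconnect, and the real timing would have to be confirmed from the driver's debug output.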
Shade Alabsa
2014-Nov-19 02:45 UTC
[Nut-upsuser] Tripp-Lite USB constantlly disconnecting.
Charles,

We did notice that there were two categories. Is there a way we could determine whether it is colliding? We do have a USB analyzer, so I'll see if that can be used to determine whether the UPS is disconnecting due to inactivity on the bus.

Thanks!
Shade