Null
2015-Oct-23 06:03 UTC
[Nut-upsuser] I/O errors with usbhid-ups and Tripp Lite SMART1500LCDT
Hi List,

I recently picked up a Tripp Lite SMART1500LCDT while they were on sale to pair with my home server, and have tried getting it set up with NUT, since it seems to be supported. I can get the whole NUT stack up and running, but the driver seems to only stay up for a couple minutes before crashing out.

I've uploaded the level 6 debug output of the driver (with my serial number removed) here:
https://wuffleton.com:10101/paste/VKP5olfp#IaqkLAyzbMDnrervAofVxqggDv3KP2B-CoUT1w+486a

From what I can gather - it runs into an I/O error, and when it tries to reconnect it's unable to claim the device because it's still busy. Are there any settings I can set in ups.conf to work around this, or any places I should start looking for issues in my configuration? This doesn't seem to be a permissions thing, since I'm seeing the same issue if I run usbhid-ups as root. I've also explicitly given the 'ups' user that I'm running NUT as permissions to the UPS device via udev.

Environment for reference:
NUT Version: v2.7.3.r128.g96c93fb (GIT); v2.7.3 stable exhibits the same behavior
OS: Arch Linux x86_64
Kernel Version: 4.2.4.201510222059-1-grsec

Would really love to get this working, and any help would be greatly appreciated!
Charles Lepple
2015-Oct-23 12:42 UTC
[Nut-upsuser] I/O errors with usbhid-ups and Tripp Lite SMART1500LCDT
[please use reply-all to include the list - the NUT lists do not add a reply-to header.]

On Oct 23, 2015, at 2:03 AM, Null <null at wuffleton.com> wrote:
>
> Hi List,
>
> I recently picked up a Tripp Lite SMART1500LCDT while they were on sale to pair with my home server, and have tried getting it set up with NUT, since it seems to be supported. I can get the whole NUT stack up and running, but the driver seems to only stay up for a couple minutes before crashing out.
>
> I've uploaded the level 6 debug output of the driver (with my serial number removed) here: https://wuffleton.com:10101/paste/VKP5olfp#IaqkLAyzbMDnrervAofVxqggDv3KP2B-CoUT1w+486a

Here's the relevant portion:

    38.788121  =================================================
    38.788129  = device has been disconnected, try to reconnect =
    38.788137  =================================================
    38.788203  Checking device (1D6B/0002) (002/001)
    38.788239  Failed to open device, skipping. (Permission denied)
    38.788249  Checking device (1D6B/0001) (008/001)
    38.788267  Failed to open device, skipping. (Permission denied)
    38.788277  Checking device (09AE/3016) (007/003)
    38.823794  - VendorID: 09ae
    38.823813  - ProductID: 3016
    38.823821  - Manufacturer: Tripp Lite
    38.823829  - Product: TRIPP LITE UPS
    38.823837  - Serial Number: <snip>
    38.823845  - Bus: 007
    38.823853  - Device release number: 0002
    38.823860  Trying to match device
    38.823882  Device matches
    38.823898  failed to claim USB device: Device or resource busy
    38.823916  failed to detach kernel driver from USB device: No such file or directory
    38.823927  failed to claim USB device: Device or resource busy
    38.823938  failed to detach kernel driver from USB device: No such file or directory
    38.823949  failed to claim USB device: Device or resource busy
    38.823960  failed to detach kernel driver from USB device: No such file or directory
    38.823971  failed to claim USB device: Device or resource busy
    38.823982  failed to detach kernel driver from USB device: No such file or directory
    38.823991  Can't claim USB device [09ae:3016]: No such file or directory
    38.824002  upsdrv_cleanup...

(Those messages are printed at debug level 2, so that's probably all you need for future logs.)

I am curious to see if there is a "sweet spot" in terms of when the driver reconnects (not too early vs. not too late). At one point, I think we had a report of an UPS that needed to see a USB connection within a certain amount of time, or it wouldn't connect. This was in the context of boot times.

> From what I can gather - it runs into an I/O error, and when it tries to reconnect it's unable to claim the device because it's still busy. Are there any settings I can set in ups.conf to workaround this or any places I should start looking for issues in my configuration? This doesn't seem to be a permissions thing, since I'm seeing the same issue if I run usbhid-ups as root. I've also explicitly given the 'ups' user that I'm running NUT as permissions to the UPS device via udev.

For reconnections at startup, we have the following options:

http://www.networkupstools.org/docs/man/ups.conf.html#_global_directives

We might want to consider applying a similar delay to the auto-reconnect code in the drivers.

Does `dmesg` show any USB messages around the time of the "No such file or directory" errors?

This seems like a similar set of symptoms: http://article.gmane.org/gmane.comp.monitoring.nut.user/8662

> Environment for reference:
> NUT Version: v2.7.3.r128.g96c93fb (GIT); v2.7.3 stable exhibits the same behavior
> OS: Arch Linux x86_64
> Kernel Version: 4.2.4.201510222059-1-grsec

So the interesting part is that I was testing the master branch yesterday (same commit) on an older Debian box (3.16 kernel). The driver had trouble starting, but once it got going, it has been running continuously.

I'm not really sold on this approach (due to the testing needed on other hardware), but this pull request reworks the USB communication so that the interface is only claimed while it is being polled:

https://github.com/networkupstools/nut/pull/122

Even though Github is reporting a potential merge conflict, it should be possible to check out that branch, and see if the driver is more stable in that version.

If nothing else, there are some value scaling fixes for this model in that pull request. The HID descriptor in this UPS is a bit of a mess. (6kHz Hz power would be... interesting.)

> Would really love to get this working, and any help would be greatly appreciated!
>
> _______________________________________________
> Nut-upsuser mailing list
> Nut-upsuser at lists.alioth.debian.org
> http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/nut-upsuser

--
Charles Lepple
clepple at gmail
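
[Editor's sketch] To look for a reconnect "sweet spot" across several runs, it may help to pull the timing of each reconnect attempt out of the driver's debug output. The following is a minimal Python sketch, assuming each log line has the shape "<seconds> <message>" as in the excerpt quoted in this message. The function names and the shortened `SAMPLE` text are hypothetical helpers for illustration and are not part of NUT.

```python
import re

# Assumed line shape, taken from the quoted excerpt: "<seconds.micros> <message>".
LINE_RE = re.compile(r"^\s*(\d+\.\d+)\s+(.*\S)\s*$")

# Shortened copy of the quoted debug output, used only to exercise the parser.
SAMPLE = """\
38.788129 = device has been disconnected, try to reconnect =
38.823898 failed to claim USB device: Device or resource busy
38.823916 failed to detach kernel driver from USB device: No such file or directory
38.823927 failed to claim USB device: Device or resource busy
38.823991 Can't claim USB device [09ae:3016]: No such file or directory
"""


def parse_debug_log(text):
    """Return a list of (timestamp_seconds, message) tuples."""
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            events.append((float(match.group(1)), match.group(2)))
    return events


def summarize_reconnect(events):
    """Summarize the first reconnect attempt, or return None if none completed."""
    start = None
    claim_failures = 0
    for timestamp, message in events:
        if start is None:
            # Wait for the banner that marks the start of a reconnect attempt.
            if "device has been disconnected" in message:
                start = timestamp
            continue
        if message.startswith("failed to claim USB device"):
            claim_failures += 1
        elif message.startswith("Can't claim USB device"):
            # The driver gave up on claiming the interface.
            return {
                "start": start,
                "gave_up": timestamp,
                "claim_failures": claim_failures,
            }
    return None


if __name__ == "__main__":
    summary = summarize_reconnect(parse_debug_log(SAMPLE))
    elapsed_ms = (summary["gave_up"] - summary["start"]) * 1000.0
    print(f"{summary['claim_failures']} claim failures in {elapsed_ms:.1f} ms")
```

In the quoted log, all four claim attempts happen within a few tens of milliseconds of the disconnect, which is the kind of pattern a delay before reconnecting would change. Running the same summary over logs from different machines would show whether the spacing differs between stable and unstable hosts.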
Charles Lepple
2015-Oct-25 02:25 UTC
[Nut-upsuser] I/O errors with usbhid-ups and Tripp Lite SMART1500LCDT
On Oct 23, 2015, at 8:42 AM, Charles Lepple <clepple at gmail.com> wrote:
>
>> Environment for reference:
>> NUT Version: v2.7.3.r128.g96c93fb (GIT); v2.7.3 stable exhibits the same behavior
>> OS: Arch Linux x86_64
>> Kernel Version: 4.2.4.201510222059-1-grsec
>
> So the interesting part is that I was testing the master branch yesterday (same commit) on an older Debian box (3.16 kernel). The driver had trouble starting, but once it got going, it has been running continuously.

I'm starting to think this is motherboard-dependent. I now have the SMART1500LCDT plugged into a HP Z800 (Xeon 5xxx) motherboard on Debian jessie (also 3.16, but might be slightly different than the other box), and it is disconnecting frequently. (The stable system is a Dell Core i5.)

--
Charles Lepple
clepple at gmail
Null
2015-Oct-26 00:33 UTC
[Nut-upsuser] I/O errors with usbhid-ups and Tripp Lite SMART1500LCDT
On 2015-10-23 5:42, Charles Lepple wrote:
> So the interesting part is that I was testing the master branch yesterday (same commit) on an older Debian box (3.16 kernel). The driver had trouble starting, but once it got going, it has been running continuously.
>
> I'm not really sold on this approach (due to the testing needed on other hardware), but this pull request reworks the USB communication so that the interface is only claimed while it is being polled:
>
> https://github.com/networkupstools/nut/pull/122
>
> Even though Github is reporting a potential merge conflict, it should be possible to check out that branch, and see if the driver is more stable in that version.
>
> If nothing else, there are some value scaling fixes for this model in that pull request. The HID descriptor in this UPS is a bit of a mess. (6kHz Hz power would be... interesting.)

The driver seems to be much more stable in this branch. I'm able to get the driver up and running on the first shot, and it's been up for at least a few hours now without any issues. The scaling fixes also definitely work, since I'm seeing sane values for everything when I use upsc to query the UPS.

On 2015-10-24 19:25, Charles Lepple wrote:
> I'm starting to think this is motherboard-dependent. I now have the SMART1500LCDT plugged into a HP Z800 (Xeon 5xxx) motherboard on Debian jessie (also 3.16, but might be slightly different than the other box), and it is disconnecting frequently. (The stable system is a Dell Core i5.)

Seeing as the Dell C2100 server I'm running it with has a very similar chipset to the Z800 (Intel 5500 vs 5520, which lspci considers to be effectively the same), I wouldn't rule this out.

So that we've got a couple more samples with different chipsets and this model of UPS, I'll test it against my Haswell i3 laptop and my Raspberry Pi 2 later this week to see if they behave any differently against the current master branch.
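
[Editor's sketch] The "sane values" check done by eye with upsc could be repeated automatically when comparing machines. The following is a minimal Python sketch, assuming upsc prints one "name: value" pair per line. The plausibility ranges, the function names, and the UPS name in the usage note are illustrative assumptions, not values taken from NUT or Tripp Lite.

```python
import subprocess

# Illustrative plausibility ranges (assumptions chosen for this sketch).
PLAUSIBLE_RANGES = {
    "input.frequency": (40.0, 70.0),  # Hz
    "input.voltage": (80.0, 280.0),   # V
    "battery.charge": (0.0, 100.0),   # percent
}


def parse_upsc(text):
    """Parse 'name: value' lines into a dict of strings."""
    values = {}
    for line in text.splitlines():
        name, separator, value = line.partition(":")
        if separator:
            values[name.strip()] = value.strip()
    return values


def out_of_range(values, ranges=PLAUSIBLE_RANGES):
    """Return {variable: number} for numeric readings outside their range."""
    suspicious = {}
    for name, (low, high) in ranges.items():
        if name not in values:
            continue
        try:
            number = float(values[name])
        except ValueError:
            continue  # Skip readings that are not numeric.
        if not low <= number <= high:
            suspicious[name] = number
    return suspicious


def query_ups(ups_name):
    """Run upsc for the given UPS and return its variables as a dict."""
    result = subprocess.run(
        ["upsc", ups_name], capture_output=True, text=True, check=True
    )
    return parse_upsc(result.stdout)
```

With a UPS section named, say, `tripplite` in ups.conf (a hypothetical name), `out_of_range(query_ups("tripplite@localhost"))` would return an empty dict when every checked reading looks plausible. A badly scaled frequency in the kHz range, like the one joked about earlier in the thread, would show up under `input.frequency`.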