David Zomaya
2019-Jun-18 17:47 UTC
[Nut-upsdev] [EXTERNAL] Re: Fixing Drops With SMART1500LCDXL & USB-HID Driver
Charles and Wolfy,

Thanks for the replies. I added some responses below. I think I got the driver debugging right, but let me know if it is off.

I should note: The CentOS 7.6 machine I have been testing with is a virtual machine (running on VMware ESXi 7.6). At least 1 customer and I have seen the same issue on VMs and physical boxes, so I don’t think that matters, but if it does let me know.

“This is a little off-topic, but I would like to point out that not including a machine-readable serial number in the USB device descriptor makes it difficult for people to reliably use two or more UPSes on a single *nix system. Due to some complications with the way that USB devices are opened in libusb, there is no easy way to "open the next unused USB device", so we recommend that people match against the serial number.”

Noted. I’ll kick this around internally and follow up. I know other units we make report the serial number. Maybe it’s a 2012 thing.

“It would be interesting to see the debug log from usbhid-ups as well. It would give a little more context to the kernel errors. I haven't used a physical CentOS or RedHat system in a while, so I am not sure of the specifics needed to just stop the usbhid-ups driver, but then you can restart it with a few "-D" flags (3 should be sufficient for this kind of problem) and "-a TrippLiteUPS" to match this configuration. Please compress any log files (gzip preferred; zip works).”

Attached. This is with the settings and udev rules mentioned in the earlier email. I can remove and redo if useful.

“pollinterval defaults to 2, and to be honest, for most other UPSes, we suggest that people raise the value (since many UPSes do not update their filtered values more frequently than that anyway).”

Good to know on the default “2”. Reading upsmon.conf, I thought it was related to “POLLFREQ” which defaults to 30.

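To make the "defaults to 2" point concrete, here is a minimal sketch of how an effective pollinterval could be resolved from ups.conf-style text. It assumes a simplified INI-like layout in which a value before the first section applies globally, as in the ups.conf fragment quoted later in this thread. The function name and the parser are hypothetical and are not NUT's own configuration code; the default of 2 is taken from the reply quoted above.

```python
DEFAULT_POLLINTERVAL = 2  # seconds; the default stated in the quoted reply


def effective_pollinterval(conf_text, ups_name):
    """Return the per-UPS pollinterval, else the global one, else the default."""
    section = None        # None while reading global (pre-section) lines
    global_value = None
    ups_value = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        key, sep, value = line.partition("=")
        if not sep or key.strip() != "pollinterval":
            continue
        if section is None:
            global_value = int(value.strip())
        elif section == ups_name:
            ups_value = int(value.strip())
    if ups_value is not None:
        return ups_value
    if global_value is not None:
        return global_value
    return DEFAULT_POLLINTERVAL


# Fragment as it appears in the original report in this thread.
SAMPLE = """
pollinterval = 1
[TrippLiteUPS]
driver = usbhid-ups
port = auto
desc = "SMART1500LCD"
"""

if __name__ == "__main__":
    print(effective_pollinterval(SAMPLE, "TrippLiteUPS"))
```

With the reported configuration the sketch resolves to 1 second; removing the `pollinterval = 1` line falls back to the default.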
“Do you know how frequently the Windows software polls the UPS?”

For our PowerAlert Local software, generally we do half a second for USB (I’d need to check if that holds for all protocols, but I cannot think of any exceptions off the top of my head). In this particular case we’ve just been testing with baked-in Windows Power Options as a reference point, so I’ll need to look into how frequently Windows polls.

“Should this be applied to other models as well, or just protocol 2012?”

I’ll dig into that and confirm. I’m not entirely convinced it is a protocol issue, but it very well could be.

“The 62-nut-usbups.rules file looks pretty standard. Do you know if the changes to 42-usb-hid-pm.rules are needed? It seems like none of the USB devices would have the right permissions if 62-nut-usbups.rules isn't sufficient (though this happened in Debian once).”

My means of testing wasn’t the most rigorous, but I did try to use variable isolation with these changes and some other changes. I could not make the drops stop without having all 3 of these changes present. I believe a web search led me to this udev rule, so I’ll dig up the link for context.

“Note that NUT 2.7.4 has been out for some time now.”

Wolfy nailed this one, I just installed whatever the repository gave me. Should I test with 2.7.4?

“Each USB vendor ID generally gets their own source file, so we could add a special case to drivers/tripplite-hid.c. As mentioned earlier, if you know that this will be a problem across all protocol 2012 UPSes, we can check for that ID. I will say that there is a bit of a logjam in the release pipeline, due to some (unnecessary, IMHO) deprecation of libusb-0.1: https://github.com/networkupstools/nut/issues/300 So it's unclear when a code change will get to users. The configuration file changes should work in the mean time.”

Cool, action item for me to follow up with confirmation of the protocol as a valid means of identifying where this is needed.

“I thought we did, but maybe I am confusing it with protocol 3016 devices. We actually added a lot of the protocol 2012 devices to the hardware compatibility list based on the test results that Eric provided, so I assume they worked then (about six years ago). The protocol 3016 devices (in particular, the SMART1500LCDT and OMNI1500LCDT) sometimes don't even stay on USB long enough to read a USB descriptor, and this does seem correlated with newer motherboards. Example: https://github.com/networkupstools/nut/issues/577 From where I stand, there really shouldn't be anything that a user-space program (like a NUT driver) can do that should be able to cause a USB device to disconnect during normal polling. (Aside from firmware updates, which we don't attempt.) That said, I recognize that USB Phy layer problems can be hard to diagnose, and power management can compound the issue.”

Interesting threads. Are you still seeing issues with LCDTs or have they subsided?

“Some users prefer not to post the entire serial numbers from their UPS when reporting issues. Is there a convention for the serial number digits such that we can ask for just the first few digits, and get an idea as to whether the problem is limited to a given hardware or firmware revision? There seems to be a firmware revision buried in the HID descriptor for some models, but I don't know how to interpret it, and some of these connection problems present themselves before the UPS can return that HID report.”

Our serial numbers break down like this: https://www.tripplite.com/support/identify-products

If they give you the first 13 characters of the serial number, you’d know the SKU and the datecode without having the full serial number. Firmware isn’t inherently baked into that though, i.e. you could have the same firmware on different SKUs. Does this help at all?

“Although these models are not as common, we still hear from users with non-HID-PDC-based USB devices (and some serial UPSes as well).
Publicly-available protocol documents would help us write better drivers for those devices. If not, a better way to identify models with proprietary protocols would be useful.”

Does “publicly available protocol” mean it needs to be accessible to anyone, or that we provide you with protocol docs if you agree not to share? I can look into the latter. I feel like we should be able to help here.

“Another thing is considering how users get started with NUT. Sometimes a user inherits an UPS on a given system, and they want to set up NUT to monitor it. Ideally, we'd like to have a way for them to quickly triage whether a particular UPS model will work, and before they have NUT installed, they will likely have "lsusb" or similar tools to enumerate devices. Other times, a user is replacing another UPS, and they want to know which models are supported by NUT before purchasing one. In both of those cases, more information about how USB IDs map to models can help smooth out those processes. At the moment, we manually add each protocol number to the usbhid-ups driver when a user tries an UPS that isn't listed already. If there were a convention that all USB idDevice values in a certain range were going to be HID PDC compliant, we could change the default from opt-in to enabled-by-default (but we wouldn't want the UPS driver to try to control a USB hub).”

Let me kick this around. If we were able to say that USB UPSes with IDs from “09ae:XXXX” to “09ae:YYYY” are PDC compliant, would that be enough? Offhand, the problem I can think of is determining if this holds for older units (I’ll dig into this).

Thank you,
David Zomaya
1111 W.
35th Street | Chicago, IL 60609 USA
david_zomaya at tripplite.com

-----Original Message-----
From: Charles Lepple <clepple at gmail.com>
Sent: Monday, June 17, 2019 9:28 PM
To: David Zomaya <David_Zomaya at tripplite.com>
Cc: nut-upsdev at alioth-lists.debian.net; Jonathan Manzanilla <Jonathan_Manzanilla at tripplite.com>; Eric Cobb <Eric_Cobb at tripplite.com>
Subject: [EXTERNAL] Re: [Nut-upsdev] Fixing Drops With SMART1500LCDXL & USB-HID Driver

On Jun 17, 2019, at 3:00 PM, David Zomaya <David_Zomaya at tripplite.com> wrote:

> Hi Network UPS Tools Support,
>
> I’m not sure if this is a question for the “user group” or the “developer group”.

The config files will be useful for -users, but I'd say the development list is probably better for discussing potential changes.

> My name is David Zomaya and I work at Tripp Lite in our technical support department. Copied on this email are Eric Cobb from our Product Management group & Jonathan Manzanilla, tech support subject matter expert for our single phase UPS product lines.

I recognize Eric's name from a few years ago - he emailed some detailed test results with NUT connecting to various Tripp-Lite models. Hi, Eric!

> Recently, we received a complaint about our SMART1500LCDXL dropping and reconnecting in different Linux Operating Systems. After some in-house testing, the behavior seems to be reproducible on a number of different *nix operating systems (Windows seems fine).
> Here’s an example of the drops in /var/log/messages (I’ll use CentOS 7.6 as my reference point throughout this email):
>
> May 29 19:25:27 localhost kernel: usb 2-2.1: new low-speed USB device number 6 using uhci_hcd
> May 29 19:25:27 localhost kernel: usb 2-2.1: New USB device found, idVendor=09ae, idProduct=2012
> May 29 19:25:27 localhost kernel: usb 2-2.1: New USB device strings: Mfr=1, Product=2, SerialNumber=0

This is a little off-topic, but I would like to point out that not including a machine-readable serial number in the USB device descriptor makes it difficult for people to reliably use two or more UPSes on a single *nix system. Due to some complications with the way that USB devices are opened in libusb, there is no easy way to "open the next unused USB device", so we recommend that people match against the serial number.

> May 29 19:25:27 localhost kernel: usb 2-2.1: Product: Tripp Lite UPS
> May 29 19:25:27 localhost kernel: usb 2-2.1: Manufacturer: Tripp Lite
> May 29 19:25:27 localhost kernel: hid-generic 0003:09AE:2012.0004: hiddev0,hidraw1: USB HID v1.10 Device [Tripp Lite Tripp Lite UPS ] on usb-0000:02:02.0-2.1/input0
> May 29 19:25:29 localhost upsd[6287]: UPS [TrippLiteUPS] data is no longer stale
> May 29 19:25:29 localhost upsd: UPS [TrippLiteUPS] data is no longer stale
> May 29 19:25:45 localhost kernel: usb 2-2.1: USB disconnect, device number 6

It would be interesting to see the debug log from usbhid-ups as well. It would give a little more context to the kernel errors. I haven't used a physical CentOS or RedHat system in a while, so I am not sure of the specifics needed to just stop the usbhid-ups driver, but then you can restart it with a few "-D" flags (3 should be sufficient for this kind of problem) and "-a TrippLiteUPS" to match this configuration. Please compress any log files (gzip preferred; zip works).

> As a result, this impacted the user’s ability to use NUT software on their Linux hosts.
> After some trial and error (and a lot of search engine use), I was able to find that the following configuration changes/settings stop the drops and stabilize performance:
>
> 1) This in the ups.conf file
> pollinterval = 1

pollinterval defaults to 2, and to be honest, for most other UPSes, we suggest that people raise the value (since many UPSes do not update their filtered values more frequently than that anyway). Do you know how frequently the Windows software polls the UPS? Should this be applied to other models as well, or just protocol 2012?

> [TrippLiteUPS]
> driver = usbhid-ups
> port = auto
> desc = "SMART1500LCD"
>
> 2) The attached 62-nut-usbups.rules file at /etc/udev/rules.d/
> 3) The attached 42-usb-hid-pm.rules file at /usr/lib/udev/rules.d/

The 62-nut-usbups.rules file looks pretty standard. Do you know if the changes to 42-usb-hid-pm.rules are needed? It seems like none of the USB devices would have the right permissions if 62-nut-usbups.rules isn't sufficient (though this happened in Debian once).

> Below is some other information that may be relevant regarding my testing.
>
> · I installed using the command “yum install nut.x86_64”
> · Operating system version: CentOS Linux release 7.6.1810 (Core)
> · Network UPS Tools version: Network UPS Tools upsd 2.7.2

Note that NUT 2.7.4 has been out for some time now.

> I’m not the most well-versed in Network UPS Tools, so I am not sure how “good” of a solution this is. I can, however, get you more information on our product and testing if that helps.
>
> The questions I have are:
> 1) Does the above seem like a “good” way to address this problem?
> (given that the drops are something we need to look into on our end)

I can't argue with the results, though I would like to narrow it down a little (there may be other issues at play with the permissions in the udev files) and make sure that it is not coincidence.

> 2) Is there a good way to get this fix implemented in the driver?

Each USB vendor ID generally gets their own source file, so we could add a special case to drivers/tripplite-hid.c. As mentioned earlier, if you know that this will be a problem across all protocol 2012 UPSes, we can check for that ID. I will say that there is a bit of a logjam in the release pipeline, due to some (unnecessary, IMHO) deprecation of libusb-0.1: https://github.com/networkupstools/nut/issues/300

So it's unclear when a code change will get to users. The configuration file changes should work in the mean time.

> 3) Have you had any reports of similar issues?

I thought we did, but maybe I am confusing it with protocol 3016 devices. We actually added a lot of the protocol 2012 devices to the hardware compatibility list based on the test results that Eric provided, so I assume they worked then (about six years ago). The protocol 3016 devices (in particular, the SMART1500LCDT and OMNI1500LCDT) sometimes don't even stay on USB long enough to read a USB descriptor, and this does seem correlated with newer motherboards.
Example: https://github.com/networkupstools/nut/issues/577

From where I stand, there really shouldn't be anything that a user-space program (like a NUT driver) can do that should be able to cause a USB device to disconnect during normal polling. (Aside from firmware updates, which we don't attempt.) That said, I recognize that USB Phy layer problems can be hard to diagnose, and power management can compound the issue.

> 4) While we are communicating, are there any other open Tripp Lite items I could help your team(s) with? No promises, but if I can help I’d like to.

Aside from those two USB issues, just a few other thoughts:

Some users prefer not to post the entire serial numbers from their UPS when reporting issues. Is there a convention for the serial number digits such that we can ask for just the first few digits, and get an idea as to whether the problem is limited to a given hardware or firmware revision? There seems to be a firmware revision buried in the HID descriptor for some models, but I don't know how to interpret it, and some of these connection problems present themselves before the UPS can return that HID report.

Although these models are not as common, we still hear from users with non-HID-PDC-based USB devices (and some serial UPSes as well). Publicly-available protocol documents would help us write better drivers for those devices. If not, a better way to identify models with proprietary protocols would be useful.

Another thing is considering how users get started with NUT. Sometimes a user inherits an UPS on a given system, and they want to set up NUT to monitor it.
Ideally, we'd like to have a way for them to quickly triage whether a particular UPS model will work, and before they have NUT installed, they will likely have "lsusb" or similar tools to enumerate devices. Other times, a user is replacing another UPS, and they want to know which models are supported by NUT before purchasing one. In both of those cases, more information about how USB IDs map to models can help smooth out those processes. At the moment, we manually add each protocol number to the usbhid-ups driver when a user tries an UPS that isn't listed already. If there were a convention that all USB idDevice values in a certain range were going to be HID PDC compliant, we could change the default from opt-in to enabled-by-default (but we wouldn't want the UPS driver to try to control a USB hub).

> Thanks for your time.

No problem, thanks for reaching out!

-------------- next part --------------
A non-text attachment was scrubbed...
Name: debuglogsmart1500NUT.log.gz
Type: application/x-gzip
Size: 28303 bytes
Desc: debuglogsmart1500NUT.log.gz
URL: <http://alioth-lists.debian.net/pipermail/nut-upsdev/attachments/20190618/17456f89/attachment-0001.bin>
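For anyone trying to quantify the drops described in this message, the following is a minimal sketch of how the connect and disconnect lines in /var/log/messages could be paired up to measure how long the UPS stayed attached. The function name and the regular expression are illustrative assumptions based only on the log excerpt quoted above. Syslog timestamps carry no year, so the year is passed in as a parameter, and month names are assumed to be in English.

```python
import re
from datetime import datetime

# Matches kernel USB lines such as:
#   May 29 19:25:45 localhost kernel: usb 2-2.1: USB disconnect, device number 6
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+\s[\d:]{8})\s+\S+\s+kernel:\s+usb\s+(\S+):\s+(.*)$"
)


def connection_durations(log_lines, year=2019):
    """Return (usb_path, seconds_connected) for each connect/disconnect pair."""
    connected_at = {}
    durations = []
    for line in log_lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # not a kernel "usb <path>:" line
        stamp, path, message = match.groups()
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        if message.startswith("new ") and "USB device number" in message:
            connected_at[path] = when
        elif message.startswith("USB disconnect") and path in connected_at:
            delta = when - connected_at.pop(path)
            durations.append((path, delta.total_seconds()))
    return durations


# Lines taken from the excerpt in this thread.
EXCERPT = [
    "May 29 19:25:27 localhost kernel: usb 2-2.1: new low-speed USB device number 6 using uhci_hcd",
    "May 29 19:25:27 localhost kernel: usb 2-2.1: New USB device found, idVendor=09ae, idProduct=2012",
    "May 29 19:25:29 localhost upsd[6287]: UPS [TrippLiteUPS] data is no longer stale",
    "May 29 19:25:45 localhost kernel: usb 2-2.1: USB disconnect, device number 6",
]

if __name__ == "__main__":
    for path, seconds in connection_durations(EXCERPT):
        print(f"usb {path}: connected for {seconds:.0f} s")
```

On the excerpt above this reports a single 18-second connection on usb 2-2.1. Running it over a longer log before and after a configuration change would give a rough comparison of drop frequency.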
David Zomaya
2019-Jun-18 18:35 UTC
[Nut-upsdev] [EXTERNAL] Re: Fixing Drops With SMART1500LCDXL & USB-HID Driver
One follow-up note: I removed the “42-usb-hid-pm.rules” file and rebooted, and the drops reoccurred. I put the “42-usb-hid-pm.rules” file back and rebooted, and the drops stopped.

Thank you,
David Zomaya
Technical Support
1111 W. 35th Street | Chicago, IL 60609 USA
773.869.1156 | david_zomaya at tripplite.com
If there were a convention that all USB idDevice values in a certain range were going to be HID PDC compliant, we could change the default from opt-in to enabled-by-default (but we wouldn't want the UPS driver to try to control a USB hub).

> Thanks for your time.

No problem, thanks for reaching out!

> _______________________________________________
> Nut-upsdev mailing list
> Nut-upsdev at alioth-lists.debian.net
> https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/nut-upsdev

________________________________
This message is for the addressee's use only. It may contain confidential information. If you receive this message in error, please delete it and notify the sender. Tripp Lite disclaims all warranties and liabilities, and assumes no responsibility for viruses which may infect an email sent to you from Tripp Lite and which damage your electronic systems or information. It is your responsibility to maintain virus detection systems to prevent damage to your electronic systems and information.
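To make the "range of product IDs" idea above concrete, here is a minimal Python sketch of what an enabled-by-default check might look like. Only the vendor ID 09AE and the product IDs 2012 and 3016 come from this thread. The range bounds, the exclusion set, and the function names are assumptions made up for illustration; they are not NUT code or a published Tripp Lite convention.

```python
# Vendor ID as it appears in the kernel log quoted in this thread (09AE:2012).
TRIPP_LITE_VENDOR_ID = 0x09AE

# ASSUMPTION: the thread never states real bounds; this range is invented.
HYPOTHETICAL_PDC_RANGE = range(0x2000, 0x4000)

# ASSUMPTION: placeholder for older non-PDC units or hubs that would need
# to be carved out of the range. Empty because none are named in the thread.
EXCLUDED_PRODUCT_IDS = frozenset()


def parse_usb_id(text):
    """Split a 'vvvv:pppp' string into integer (vendor, product) IDs."""
    vendor, product = text.strip().split(":")
    return int(vendor, 16), int(product, 16)


def looks_hid_pdc(usb_id):
    """Return True if the ID falls inside the hypothetical compliant range."""
    vendor, product = parse_usb_id(usb_id)
    return (
        vendor == TRIPP_LITE_VENDOR_ID
        and product in HYPOTHETICAL_PDC_RANGE
        and product not in EXCLUDED_PRODUCT_IDS
    )
```

With a check like this, a model released after the NUT version was built would still be accepted as long as its product ID stays inside the agreed range, which is the future-proofing discussed above. Anything outside the range would still need the manual opt-in.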
Charles Lepple
2019-Jun-19 02:59 UTC
[Nut-upsdev] [EXTERNAL] Re: Fixing Drops With SMART1500LCDXL & USB-HID Driver
On Jun 18, 2019, at 1:47 PM, David Zomaya wrote:
>
> Charles and Wolfy,
>
> Thanks for the replies.
>
> I added some responses below.
> I think I got the driver debugging right, but let me know if it is off.
>
> I should note:
> The CentOS 7.6 machine I have been testing with is a virtual machine (running on VMware ESXi 7.6). At least 1 customer and I have seen the same issue on VMs and physical boxes, so I don’t think that matters, but if it does let me know.
>
>> “This is a little off-topic, but I would like to point out that not including a machine-readable serial number in the USB device descriptor makes it difficult for people to reliably use two or more UPSes on a single *nix system. Due to some complications with the way that USB devices are opened in libusb, there is no easy way to "open the next unused USB device", so we recommend that people match against the serial number.”
>>
> Noted. I’ll kick this around internally and follow up. I know other units we make report the serial number. Maybe it’s a 2012 thing.

I think we saw a similar problem with APC UPSes not having string descriptors when attached to a VM, so that is worth checking on a physical box. If so, I apologize for jumping to the conclusion that the UPS was not providing the serial number.

> “It would be interesting to see the debug log from usbhid-ups as well. It would give a little more context to the kernel errors. I haven't used a physical CentOS or RedHat system in a while, so I am not sure of the specifics needed to just stop the usbhid-ups driver, but then you can restart it with a few "-D" flags (3 should be sufficient for this kind of problem) and "-a TrippLiteUPS" to match this configuration. Please compress any log files (gzip preferred; zip works).”
>
> Attached. This is with the settings and udev rules mentioned in the earlier. I can remove and redo if useful.

The log format looks okay, but I may not have been clear that I was looking for the log at the same time as when the kernel errors occur. So without the udev rules would be good.

> “pollinterval defaults to 2, and to be honest, for most other UPSes, we suggest that people raise the value (since many UPSes do not update their filtered values more frequently than that anyway).”
>
> Good to know on the default “2”. Reading upsmon.conf, I thought it was related to “POLLFREQ” which defaults to 30.

We've reworked some of the documentation for the next release, but the short answer is that USB HID drivers poll the essential status bits every `pollinterval` seconds ("Quick update..." in the log), and then grab the rest every `pollfreq` seconds ("Full update").

> “Do you know how frequently the Windows software polls the UPS?”
>
> For our PowerAlert Local software, generally we do half a second for USB (I’d need to check if that holds for all protocols, but I cannot think of any exceptions off the top of my head). In this particular case we’ve just been testing with baked-in Windows Power Options as a reference point here, so I’ll need to look into how frequently Windows polls.

Thanks, that is useful to know.

> “Should this be applied to other models as well, or just protocol 2012?”
>
> I’ll dig into that and confirm. I’m not entirely convinced it is a protocol issue, but it very well could be.

(By "protocol" I am just referring to the idProduct field, rather than the protocol itself, since that is what the driver would eventually use to enable any special cases.)

> “The 62-nut-usbups.rules file looks pretty standard. Do you know if the changes to 42-usb-hd-pm.rules are needed? It seems like none of the USB devices would have the right permissions if 62-nut-usbups.rules isn't sufficient (though this happened in Debian once).”
>
> My means of testing wasn’t the most rigorous, but I did try to use variable isolation with these changes and some other changes. I could not make the drops stop without having all 3 of these changes present. I believe a web search led me to this udev rule so I’ll dig up the link for context.

This is starting to make sense, though. The link would be helpful, but no worries if you can't find it.

I think you mentioned the CentOS version - which kernel version does that run? ("uname -r" is probably sufficient)

I will try to read up on the USB power management settings that the udev file is changing.

> “Note that NUT 2.7.4 has been out for some time now.”
>
> Wolfy nailed this one, I just installed whatever the repository gave me. Should I test with 2.7.4?

Might not be necessary.

>> “I thought we did, but maybe I am confusing it with protocol 3016 devices. We actually added a lot of the protocol 2012 devices to the hardware compatibility list based on the test results that Eric provided, so I assume they worked then (about six years ago).
>>
>> The protocol 3016 devices (in particular, the SMART1500LCDT and OMNI1500LCDT) sometimes don't even stay on USB long enough to read a USB descriptor, and this does seem correlated with newer motherboards. Example: https://github.com/networkupstools/nut/issues/577
>>
>> From where I stand, there really shouldn't be anything that a user-space program (like a NUT driver) can do that should be able to cause a USB device to disconnect during normal polling. (Aside from firmware updates, which we don't attempt.) That said, I recognize that USB Phy layer problems can be hard to diagnose, and power management can compound the issue.”
>>
> Interesting threads. Are you still seeing issues with LCDTs or have they subsided?

I moved my SMART1500LCDT off of the always-on system, so I don't have continuous data for it. I don't think it worked on Raspbian stretch the last time I tried. Others have rigged up scripts to reset a USB hub to simulate re-plugging the UPS USB connection.

> “Some users prefer not to post the entire serial numbers from their UPS when reporting issues. Is there a convention for the serial number digits such that we can ask for just the first few digits, and get an idea as to whether the problem is limited to a given hardware or firmware revision? There seems to be a firmware revision buried in the HID descriptor for some models, but I don't know how to interpret it, and some of these connection problems present themselves before the UPS can return that HID report.”
>
> Our serial numbers break down like this:
> https://www.tripplite.com/support/identify-products
> If they give you the first 13 characters of the serial number, you’d know the SKU and the datecode without having the full serial number. Firmware isn’t inherently baked into that though. i.e. you could have the same firmware on different SKUs.
> Does this help at all?

Very useful, thanks!

> “Although these models are not as common, we still hear from users with non-HID-PDC-based USB devices (and some serial UPSes as well). Publicly-available protocol documents would help us write better drivers for those devices. If not, a better way to identify models with proprietary protocols would be useful.”
>
> Does publicly available protocol = needs to be accessible to anyone
> Or
> We provide you with protocol docs if you agree not to share?
> I can look into the latter. I feel like we should be able to help here.

I think that a lot of the content of the protocol docs would end up in the logic of the source code. Trying to write open-source code based on a document that can't be shared seems risky to me.

>> “Another thing is considering how users get started with NUT. Sometimes a user inherits an UPS on a given system, and they want to set up NUT to monitor it. Ideally, we'd like to have a way for them to quickly triage whether a particular UPS model will work, and before they have NUT installed, they will likely have "lsusb" or similar tools to enumerate devices. Other times, a user is replacing another UPS, and they want to know which models are supported by NUT before purchasing one. In both of those cases, more information about how USB IDs map to models can help smooth out those processes. At the moment, we manually add each protocol number to the usbhid-ups driver when a user tries an UPS that isn't listed already. If there were a convention that all USB idDevice values in a certain range were going to be HID PDC compliant, we could change the default from opt-in to enabled-by-default (but we wouldn't want the UPS driver to try to control a USB hub).”
>>
> Let me kick this around. If we were able to say: USB UPSes with IDs from “09ae:XXXX” to “09ae:YYYY” are PDC compliant, would that be enough?
> Topically the problem I can think of is determining if this holds for older units (I’ll dig into this).

I think a range of IDs (even with a few exceptions for the older units) would be sufficient. The ideal scenario is that the range is somewhat future-proof, so a version of NUT from this year can properly identify next year's UPS. If not, we still have manual ways for users to add their idProduct to ups.conf and the udev files, but as you can imagine, that is frustrating for a new user.
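The quick/full polling split described earlier in this message can be pictured with a small Python model. This is only a simplified sketch of the timing, assuming the driver wakes every `pollinterval` seconds and does a full update whenever at least `pollfreq` seconds have passed since the last one. The actual usbhid-ups scheduling logic is not shown in this thread and may differ. The function name and the "quick"/"full" labels are made up for the example.

```python
def update_schedule(pollinterval, pollfreq, duration):
    """Model which wake-ups are quick vs. full updates.

    Returns a list of (second, kind) pairs for a run of `duration` seconds,
    where kind is "quick" (essential status bits only) or "full" (all data).
    ASSUMPTION: a simplified model, not the driver's real logic.
    """
    events = []
    last_full = None
    for now in range(0, duration + 1, pollinterval):
        if last_full is None or now - last_full >= pollfreq:
            events.append((now, "full"))
            last_full = now
        else:
            events.append((now, "quick"))
    return events


# pollinterval=2 is the default mentioned in this thread; 30 is an example
# pollfreq value chosen for illustration.
one_minute = update_schedule(pollinterval=2, pollfreq=30, duration=60)
full_times = [second for second, kind in one_minute if kind == "full"]
```

In this model, dropping `pollinterval` from 2 to 1 roughly doubles the number of quick updates while leaving the number of full updates unchanged. That makes the reported workaround interesting: it suggests the UPS benefits from being contacted more often, rather than from more data being read.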
Manuel Wolfshant
2019-Jun-19 06:39 UTC
[Nut-upsdev] [EXTERNAL] Re: Fixing Drops With SMART1500LCDXL & USB-HID Driver
On 6/19/19 5:59 AM, Charles Lepple wrote:
>> “The 62-nut-usbups.rules file looks pretty standard. Do you know if the changes to 42-usb-hd-pm.rules are needed? It seems like none of the USB devices would have the right permissions if 62-nut-usbups.rules isn't sufficient (though this happened in Debian once).”
>>
>> My means of testing wasn’t the most rigorous, but I did try to use variable isolation with these changes and some other changes. I could not make the drops stop without having all 3 of these changes present. I believe a web search led me to this udev rule so I’ll dig up the link for context.
> This is starting to make sense, though. The link would be helpful, but no worries if you can't find it.
>
> I think you mentioned the CentOS version - which kernel version does that run? ("uname -r" is probably sufficient)

CentOS 7.x uses RedHat's idea of 3.10.0, which means it's a heavily patched 3.10. And by heavily I mean that in the 4 years since RHEL 7 was released, they added tons (literally thousands) of backports from 4.xx, including from 4.18.

The latest available kernel in the CentOS 7.6 line is 3.10.0-957.21.2, but today we will probably release 3.10.0-957.21.3, which includes the fix for TCP SACK. The public beta of RHEL 7.7 uses 3.10.0-1049, but 7.7 GA will certainly use a newer release. I already have a beta of 3.10.0-1055 in use.
David Zomaya
2019-Jun-19 20:16 UTC
[Nut-upsdev] [EXTERNAL] Re: Fixing Drops With SMART1500LCDXL & USB-HID Driver
Thanks for the inputs. I’ll get CentOS 7.6 installed on a physical machine this week and follow up. Few responses and notes below in the interim.

“The log format looks okay, but I may not have been clear that I was looking for the log at the same time as when the kernel errors occur. So without the udev rules would be good.”

Ah, sorry. Attached are these two files:
1- norules.log.gz - Neither 62-nut-usbups.rules nor 42-usb-hid-pm.rules at /etc/udev/rules.d/ or /usr/lib/udev/rules.d/. This really just left me with “permission denied”. I assume this is because 62-nut-usbups.rules is needed and #2 is the relevant file.
2- norule42.log.gz - 62-nut-usbups.rules at /lib/udev/rules.d/ and no “pollinterval = 1” value set in ups.conf. This yielded better results. I also included the output from /var/log/messages, lsusb -v, uname -r, and cat /etc/os-release. In the /var/log/messages output, you can see a few of the drops we are trying to prevent.

“I think you mentioned the CentOS version - which kernel version does that run? ("uname -r" is probably sufficient)”

3.10.0-957.el7.x86_64

“I moved my SMART1500LCDT off of the always-on system, so I don't have continuous data for it. I don't think it worked on Raspbian stretch the last time I tried. Others have rigged up scripts to reset a USB hub to simulate re-plugging the UPS USB connection.”

Ok, if issues come up going forward and I can help, feel free to reach out (same for any new Tripp Lite issues).

“I think that a lot of the content of the protocol docs would end up in the logic of the source code. Trying to write open-source code based on a document that can't be shared seems risky to me.”

That’s fair. I’d like to see us do something to help though. Let me see where this goes internally and follow up.

“I think a range of IDs (even with a few exceptions for the older units) would be sufficient. The ideal scenario is that the range is somewhat future-proof, so a version of NUT from this year can properly identify next year's UPS. If not, we still have manual ways for users to add their idProduct to ups.conf and the udev files, but as you can imagine, that is frustrating for a new user.”

Almost everything we make now is PDC compliant, so we should at least be able to do this. We’re looking into something easy to identify in the output of lsusb -v though since that could help with legacy support.

Note:
Interestingly, I did some tinkering with openSUSE 15.1 Leap on a physical box and saw the same drops there, but was able to get them to stop after installing NUT and just doing the “standard” configuration. However, I ran into some connection refused messages when trying to pull data from the UPS, e.g.:

user@linux-nxmm:/dev/bus> sudo upsc -l
Error: Connection failure: Connection refused

So I was clearly doing something wrong. But the UPS stopped dropping (i.e. it retained its Device # in lsusb and there were no messages in /var/log/messages). I did NOT have to make the changes to “42-hd-audio-pm.rules” to make the drops stop in this case. Given the different kernel (4.12.14-lp151.27-default) and newer version of NUT (2.7.4), this may not be apples to apples though. I hope that I will be able to provide better data once I get to test further later this week.

Topically, this has me thinking at the least I need to confirm the reproducibility of the fix I mentioned on a physical box. I will follow up by close of business Friday.

Thank you,
David Zomaya
Tripp Lite 1111 W.
35th Street | Chicago, IL 60609 USA

-------------- next part --------------
A non-text attachment was scrubbed...
Name: norules.log.gz
Type: application/x-gzip
Size: 348 bytes
Desc: norules.log.gz
URL: <http://alioth-lists.debian.net/pipermail/nut-upsdev/attachments/20190619/f78da1d8/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: norule42.log.gz
Type: application/x-gzip
Size: 16001 bytes
Desc: norule42.log.gz
URL: <http://alioth-lists.debian.net/pipermail/nut-upsdev/attachments/20190619/f78da1d8/attachment-0003.bin>
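Since the message above mentions looking for something easy to identify in lsusb output, and uses a retained device number as the sign that the drops have stopped, here is a small Python sketch of that kind of triage. It assumes the usual one-line-per-device format printed by plain `lsusb`. The sample text is made up for illustration (only the 09ae:2012 ID comes from this thread), and the function name is hypothetical.

```python
import re

# One device per line, e.g. "Bus 002 Device 006: ID 09ae:2012 Tripp Lite".
LSUSB_LINE = re.compile(
    r"^Bus (?P<bus>\d+) Device (?P<device>\d+): "
    r"ID (?P<vendor>[0-9a-fA-F]{4}):(?P<product>[0-9a-fA-F]{4})\s*(?P<name>.*)$"
)


def find_devices(lsusb_output, vendor_id="09ae"):
    """Return (bus, device, product_id, name) tuples for one USB vendor."""
    matches = []
    for line in lsusb_output.splitlines():
        m = LSUSB_LINE.match(line.strip())
        if m and m.group("vendor").lower() == vendor_id.lower():
            matches.append(
                (m.group("bus"), m.group("device"),
                 m.group("product").lower(), m.group("name"))
            )
    return matches


# ASSUMPTION: invented sample output, not taken from the attached logs.
SAMPLE = """\
Bus 002 Device 006: ID 09ae:2012 Tripp Lite
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
"""
```

Running `find_devices` on two snapshots taken a few minutes apart and comparing the device numbers gives a rough drop check. If the number has changed, the UPS was most likely re-enumerated in between.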