Arjen de Korte
2007-Jun-21 09:13 UTC
[Nut-upsdev] [nut-commits] svn commit r971 - in trunk: . drivers
> Author: aquette > Date: Thu Jun 21 07:43:46 2007 > New Revision: 971 > > Log: > fix communication lost status handling > > Modified: > trunk/ChangeLog > trunk/drivers/usbhid-ups.c > > Modified: trunk/ChangeLog > =============================================================================> --- trunk/ChangeLog (original) > +++ trunk/ChangeLog Thu Jun 21 07:43:46 2007 > @@ -1,3 +1,8 @@ > +Thu Jun 21 07:07:25 UTC 2007 / Arnaud Quette <aquette.dev at gmail.com> > + > + - usbhid-ups: improve the handling and reporting of the communication > + lost status. Previously, these were not reported anymore. > + > Web Jun 20 22:14:43 UTC 2007 / Carlos Rodrigues > <carlos.efr at mail.telepac.pt> > > - drivers/megatec.c, drivers/megatec_usb.c: reverted changes to > > Modified: trunk/drivers/usbhid-ups.c > =============================================================================> --- trunk/drivers/usbhid-ups.c (original) > +++ trunk/drivers/usbhid-ups.c Thu Jun 21 07:43:46 2007 > @@ -987,7 +987,7 @@ > > static void reconnect_ups(void) > { > - if (hd == NULL) > + while (hd == NULL)^^^^^^^^^^^^^^^^^^ If the communication to the UPS is lost (and can't be established again), it will stay in this loop forever and therefor never leave the upsdrv_updateinfo() function. This is bad for two reasons: #1 The driver won't respond to a termination signal anymore, since the exit_flag (that is set by a SIGTERM for instance) is not evaluated in upsdrv_updateinfo(). #2 The driver won't respond to the server either, so it will appear (and be flagged) dead, while the actual problem is in the connection to the UPS. People might try to restart it, which isn't going to work due to #1 mentioned above. We warn for this behaviour in 'docs/new-drivers.txt': ---8<--- upsdrv_updateinfo ----------------- Poll the hardware, and update any variables that you care about monitoring. Use dstate_setinfo() to store the new values. Do at most one pass of the variables. You MUST return from this function or upsd will be unable to read data from your driver. main will call this function at regular intervals. If your UPS hardware is slow or brain-damaged, you might use an odd/even approach to this function. Half the time, set a flag and return. The other half, clear the flag and actually do the poll. --->8--- Best regards, Arjen> { > upsdebugx(2, "=================================================="); > upsdebugx(2, "= device has been disconnected, try to reconnect ="); > @@ -1001,8 +1001,12 @@ > udev = NULL; > #endif > > - if ((hd = HIDOpenDevice(&udev, &curDevice, reopen_matcher, > MODE_REOPEN)) == NULL) > - dstate_datastale(); > + if ((hd = HIDOpenDevice(&udev, &curDevice, reopen_matcher, > MODE_REOPEN)) == NULL) { > + dstate_datastale(); > + > + /* Wait a bit before retrying... */ > + sleep(5); > + } > } > } > > > _______________________________________________ > nut-commits mailing list > nut-commits at lists.alioth.debian.org > http://lists.alioth.debian.org/mailman/listinfo/nut-commits > >-- Eindhoven - The Netherlands Key fingerprint - 66 4E 03 2C 9D B5 CB 9B 7A FE 7E C1 EE 88 BC 57
Arjen de Korte
2007-Jun-21 12:53 UTC
[Nut-upsdev] [nut-commits] svn commit r971 - in trunk: . drivers
> Author: aquette > Date: Thu Jun 21 07:43:46 2007 > New Revision: 971 > > Log: > fix communication lost status handling > > Modified: > trunk/ChangeLog > trunk/drivers/usbhid-ups.c > > Modified: trunk/ChangeLog > =============================================================================> --- trunk/ChangeLog (original) > +++ trunk/ChangeLog Thu Jun 21 07:43:46 2007 > @@ -1,3 +1,8 @@ > +Thu Jun 21 07:07:25 UTC 2007 / Arnaud Quette <aquette.dev at gmail.com> > + > + - usbhid-ups: improve the handling and reporting of the communication > + lost status. Previously, these were not reported anymore.Short of the problem with this patch I mentioned earlier today, I don't see how it is going to improve the situation. You can call dstate_datastale() as often as you like, as long as the status doesn't change, the driver will not report this to the server. This leaves the bad side effect that I mentioned earlier, the driver locks up until the communication with the UPS is re-established. I fear that this will lead to wrong conclusions, where people are looking for communication problems between driver and server, where they are actually between UPS and driver. As far as I can see, reconnect_ups() is called in two locations: 1) right after entering upsdrv_updateinfo() if 'hd' is found to be NULL 2) near the end of hid_ups_walk(), where 'hd' is made NULL just before calling reconnect_ups() Effectively, this means that as long as 'hd' stays NULL(), the reconnect_ups() function as it was, would be called every time upsdrv_updateinfo() runs. Typically, this will happen every two seconds, the default interval for polling a UPS. Since 'hd' will only be assigned a value in reconnect_ups() and upsdrv_initups(), this is the case, right? So it would try to re-establish the connection every so often. Best regards, Arjen -- Eindhoven - The Netherlands Key fingerprint - 66 4E 03 2C 9D B5 CB 9B 7A FE 7E C1 EE 88 BC 57
Arnaud Quette
2007-Jun-22 13:07 UTC
[Nut-upsdev] [nut-commits] svn commit r971 - in trunk: . drivers
2007/6/21, Arjen de Korte <nut+devel at de-korte.org>:> > > Author: aquette > > Date: Thu Jun 21 07:43:46 2007 > > New Revision: 971 > > > > Log: > > fix communication lost status handling > > > > Modified: > > trunk/ChangeLog > > trunk/drivers/usbhid-ups.c > > > > Modified: trunk/ChangeLog > > =============================================================================> > --- trunk/ChangeLog (original) > > +++ trunk/ChangeLog Thu Jun 21 07:43:46 2007 > > @@ -1,3 +1,8 @@ > > +Thu Jun 21 07:07:25 UTC 2007 / Arnaud Quette <aquette.dev at gmail.com> > > + > > + - usbhid-ups: improve the handling and reporting of the communication > > + lost status. Previously, these were not reported anymore. > > + > > Web Jun 20 22:14:43 UTC 2007 / Carlos Rodrigues > > <carlos.efr at mail.telepac.pt> > > > > - drivers/megatec.c, drivers/megatec_usb.c: reverted changes to > > > > Modified: trunk/drivers/usbhid-ups.c > > =============================================================================> > --- trunk/drivers/usbhid-ups.c (original) > > +++ trunk/drivers/usbhid-ups.c Thu Jun 21 07:43:46 2007 > > @@ -987,7 +987,7 @@ > > > > static void reconnect_ups(void) > > { > > - if (hd == NULL) > > + while (hd == NULL) > ^^^^^^^^^^^^^^^^^^ > If the communication to the UPS is lost (and can't be established again), > it will stay in this loop forever and therefor never leave the > upsdrv_updateinfo() function. This is bad for two reasons: > > #1 The driver won't respond to a termination signal anymore, since the > exit_flag (that is set by a SIGTERM for instance) is not evaluated in > upsdrv_updateinfo(). > > #2 The driver won't respond to the server either, so it will appear (and > be flagged) dead, while the actual problem is in the connection to the > UPS. People might try to restart it, which isn't going to work due to #1 > mentioned above. > > We warn for this behaviour in 'docs/new-drivers.txt': > ...oooch, fully right. thanks Arjen for your continued proof reading. it seems that my brain is currently too absorbed by the APC | MGE and now Eaton / Powerware things... http://www.eaton.com/EatonCom/Markets/Electrical/CT_127054 cleaner fix applied. -- Arnaud