Arnaud Quette
2011-Feb-15 13:16 UTC
[Nut-upsuser] [Bug 535583] Excessive logging by apcsmart program
2011/2/15 Lupe Christoph

> On Monday, 2011-02-14 at 21:54:20 -0000, Arnaud Quette wrote:
> > I definitely need more info!
> > please reply to ALL:
>
> > - what is the exact model and date of manufacturing?
>
> SmartUPS 300I NET. I have the serial number (GS9809283199) but no date.

it seems to be a recent model.

> > - are you sure this unit is ok?
>
> You can't prove the absence of faults.

this was related to the following question...

> > - have you really checked the cabling or made the whole (cable + UPS) work
> > somehow (using APC's software or apcupsd)?
>
> Well, as I said this is working OK for days or weeks. Then something
> happens that triggers a bug in apcsmart.

quickly reading back the thread, I can't find this info...

> > - what is the mean time between occurrences of these issues?
>
> I don't have enough data. It's in the range of weeks or months.

as per your previous posts, this seemed to be more a matter of minutes / hours.

> > - is the device reachable (using upsc for example) between issues?
>
> Sure, everything works fine.
>
> > A driver debug output is really needed!
>
> I'm running it again, but no promises. Reboots are much more frequent
> than this misbehaviour.
>
> > Note that I'm not the developer of this driver, nor have any acquaintance
> > with APC.
>
> Same here. Though I will probably try to locate this bug if we don't
> make progress with the debugging output, either because it does not tell
> us enough or because I don't manage to capture it.
>
> I would have thought finding the place in the code where it is trying to
> reset the UPS connection wouldn't be this hard.

this is not the problem. This code is in the smartmode() function of apcsmart.c:
http://svn.debian.org/wsvn/nut/trunk/drivers/apcsmart.c

we see the 5 attempts to go to smart mode ('Y' command), but my aim is to
understand why it is failing, and how to cleanly solve this without
impacting support for other units.
Some more questions:
- how are you handling the device's permissions?
Refer to § II, section 3:
http://git.debian.org/?p=collab-maint/nut.git;a=blob_plain;f=debian/nut.README.Debian;hb=HEAD

cheers
Arnaud
--
Linux / Unix Expert R&D - Eaton - http://powerquality.eaton.com
Network UPS Tools (NUT) Project Leader - http://www.networkupstools.org/
Debian Developer - http://www.debian.org
Free Software Developer - http://arnaud.quette.free.fr/
--
Municipal Councillor - Saint Bernard du Touvet
Lupe Christoph
2011-Feb-15 13:22 UTC
[Nut-upsuser] [Bug 535583] Excessive logging by apcsmart program
On Tuesday, 2011-02-15 at 14:16:58 +0100, Arnaud Quette wrote:

> > I would have thought finding the place in the code where it is trying to
> > reset the UPS connection wouldn't be this hard.

> this is not the problem. This code is in the smartmode() function of
> apcsmart.c:
> http://svn.debian.org/wsvn/nut/trunk/drivers/apcsmart.c

I'll have a look at that code.

> we see the 5 attempts to go to smart mode ('Y' command), but my aim is to
> understand why it is failing, and how to cleanly solve this without
> impacting support for other units.

Of course. The problem is that the program keeps sending the command
indefinitely, probably because of the EIO.

> Some more questions:
> - how are you handling the device's permissions?
> Refer to § II, section 3:
> http://git.debian.org/?p=collab-maint/nut.git;a=blob_plain;f=debian/nut.README.Debian;hb=HEAD

/etc/udev/rules.d/zzzlpc.rules:
KERNEL=="ttyS2", OWNER="nut", GROUP="nut", MODE="0660"

The serial line is on a PCI board. It may be a problem with that board, not
the UPS, one that is cleared by closing the device.

Lupe Christoph
--
| It is a well-known fact in any organisation that, if you want a job |
| done, you should give it to someone who is already very busy.       |
| Terry Pratchett, "Unseen Academicals"                               |
Arnaud Quette
2011-Apr-21 08:34 UTC
[Nut-upsdev] [Bug 535583] Excessive logging by apcsmart program
Hi Lupe,

since we now have an apcsmart maintainer, I'm forwarding this issue to him.

@Michal: could you please have a look at this issue [1], and give us your
feeling?

cheers,
Arnaud
--
[1] https://bugs.launchpad.net/bugs/535583

2011/2/15 Lupe Christoph <lupe at lupe-christoph.de>

> On Tuesday, 2011-02-15 at 13:16:58 -0000, Arnaud Quette wrote:
>
> > this is not the problem. This code is in the smartmode() function of
> > apcsmart.c:
> > http://svn.debian.org/wsvn/nut/trunk/drivers/apcsmart.c
>
> > we see the 5 attempts to go to smart mode ('Y' command), but my aim is to
> > understand why it is failing, and how to cleanly solve this without
> > impacting support for other units.
>
> I found no code that does five attempts. But there is this code in main.c,
> starting at line 618:
>
>     while (!exit_flag) {
>
>         struct timeval timeout;
>
>         gettimeofday(&timeout, NULL);
>         timeout.tv_sec += poll_interval;
>
>         upsdrv_updateinfo();
>
>         while (!dstate_poll_fds(timeout, extrafd) && !exit_flag) {
>             /* repeat until time is up or extrafd has data */
>
> upsdrv_updateinfo() calls smartmode().
>
> dstate_poll_fds() checks if there is any file descriptor that is
> "available". In our case:
>
> select(7, [4 5 6], NULL, NULL, {1, 999837}) = 1 (in [4], left {1, 999835})
>
> FD 4 is the serial line, which is passed to dstate_poll_fds() as
> extrafd.
>
> When there is data that can be read from the UPS, no code in
> dstate_poll_fds() reads from extrafd; there is only code that reads
> from the other input FDs. The outer loop above also ignores extrafd.
> exit_flag is never set, so it continues. And because there is an active
> file descriptor, the select returns immediately (actually it takes two
> microseconds).
>
> The solution is to add code that reads all data from extrafd and discards
> it because nobody asked for it. I would also close and reopen the serial
> line in smartmode(). I would prepare a patch if I knew more about the
> I/O abstractions used in the nut driver code. Sorry.
>
> HTH,
> Lupe Christoph
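The fix proposed in the quoted message, reading and discarding unsolicited data from extrafd so that select() stops returning immediately, could look roughly like the sketch below. It is not NUT code: `fd_readable()` and `drain_fd()` are hypothetical helper names, and any descriptor (a pipe, for instance) can stand in for the serial line when trying it out.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not NUT code): wait up to timeout_ms for fd to
 * become readable. Returns select()'s result: 1 if readable, 0 on
 * timeout, -1 on error. */
static int fd_readable(int fd, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    return select(fd + 1, &rfds, NULL, NULL, &tv);
}

/* Hypothetical helper: read and discard everything currently pending on
 * fd. Returns the number of bytes thrown away, or -1 on a hard read error
 * (for example EIO), which is where a caller could close and reopen the
 * serial line. */
static long drain_fd(int fd)
{
    char buf[256];
    long total = 0;

    /* Poll with a zero timeout so this never blocks waiting for new data. */
    while (fd_readable(fd, 0) > 0) {
        ssize_t n = read(fd, buf, sizeof(buf));

        if (n > 0) {
            total += n;
            continue;
        }
        if (n == 0)
            break;              /* end of file: nothing more to discard */
        if (errno == EINTR)
            continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            break;
        return -1;
    }

    return total;
}
```

In a main loop like the one quoted above, something like `drain_fd(extrafd)` would be called whenever the poll reports extrafd readable and no reply is expected. Until the pending bytes are consumed, every select() on that descriptor returns at once, which matches the busy loop described in the message.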