All,

I have a CyberPower UPS that I have been working with for about a year. I have used NUT in the past with good results, but lately I have been seeing issues with NUT talking to the UPS. I will start with the information on the server that is running NUT and has the UPS connected via an RS-232 cable.

O/S: Fedora 9
Kernel (uname -a): Linux haruhi 2.6.26.5-45.fc9.x86_64 #1 SMP Sat Sep 20 03:23:12 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux

rpmquery -a | grep nut-
nut-client-2.2.2-1.fc9.x86_64
nut-xml-2.2.2-1.fc9.x86_64
nut-2.2.2-1.fc9.x86_64
nut-cgi-2.2.2-1.fc9.x86_64
nut-devel-2.2.2-1.fc9.x86_64

Network UPS Tools - CyberPower driver 1.00 (2.2.2)
Memory: 4 GB

The details on the UPS:
Model: CPS1500AVR
Information on where I bought it (doubt this is needed, but it may be useful): http://www.newegg.com/Product/Product.aspx?Item=N82E16842102006

The problem I am seeing now is that the driver starts correctly and initially displays information back from the UPS, but on any subsequent read of the UPS I get the error:

Mon Oct 27-11:55:52-root at haruhi:ups> upsc cyberpower-ups
Error: Data stale

I have only started seeing this error since the UPS was moved to a newer system on a newer version of Fedora. The difference between the versions is that it was working on 2.1.x and is not working on 2.2.x. I would fall back to the old system, but that was replaced due to a catastrophic hardware failure that wasn't power related.

Any help would be appreciated, and I can provide more information if it is needed.

Regards,
Seann
Kjell Claesson
2008-Oct-27 18:41 UTC
[Nut-upsuser] Cyberpower/powerpanel error: Data stale
On Monday 27 October 2008 17.59.44, Seann Clark wrote:
> All,
>

Hi Seann,

8<--------------------------------snip---------------------------------

> What I am seeing that is the problem now is that the driver starts
> correctly, and initially displays information back from the UPS, but on
> any subsequent reads of the UPS, I get the error:
> Mon Oct 27-11:55:52-root at haruhi:ups> upsc cyberpower-ups
> Error: Data stale

I'm not so familiar with the cyberpower driver, but I looked in the SVN repository and it seems that the powerpanel driver is on its way to replacing it. I think Arjen is the one who knows all about this.

>
> I have just started seeing this error as the ups was moved to a
> newer system on a newer version of Fedora. the differences between the
> version was it was working on 2.1.x and not on 2.2.x. I would fall back
> to the old system but that was replaced due to catastrophic hardware
> failure, that wasn't power related.
>

Could you try out the powerpanel driver and report back what it says, and whether you get the same "Data stale" error?

Or is it possible that the new setup injects noise into the communication cable? I had this on one UPS, where I got a ground loop through the cable.

> Any help would be appreciated though, and I can provide even more
> information if it is needed.
>

Regards
Kjell
Arjen de Korte
2008-Oct-28 16:18 UTC
[Nut-upsuser] Cyberpower/powerpanel error: Data stale
Quoting Seann Clark <nombrandue at tsukinokage.net>:

> This output is after everything has been shut down and all NUT
> related items killed out of memory after a init script being run to
> stop the main items. The output of the debug is:
> debug level is '3'
> Trying binary protocol...
> read: (20 bytes) => 2e 4f 50 31 35 30 30 20 20 20 20 2e 78 2e 3c 2e 35 31
> 30 30
> CyberPower UPS with binary protocol on /dev/ttyS0 detected

The autodetection of the protocol works at least; it is reporting an OP1500 model.

> send: (3 bytes) => 52 02 0d
> read: (2 bytes) => 52 00
> send: (3 bytes) => 52 04 0d
> read: (2 bytes) => 52 00
> send: (3 bytes) => 52 08 0d
> read: (2 bytes) => 52 00
> send: (3 bytes) => 52 18 0d
> read: (2 bytes) => 52 00

Here a couple of settings are read from the UPS. It is using the table for the OP series, so this is correct too.

> send: (2 bytes) => 44 0d
> read: (14 bytes) => 23 7d 00 22 2e d6 80 b1 2e 09 f6 00 ff 0d
> dstate_init: sock /var/run/nut/powerpanel-cyberpower-ups open on fd 6
> send: (2 bytes) => 44 0d
> read: (14 bytes) => 23 7c 00 21 2e d6 7f b1 2e 09 f6 00 ff 0d
> send: (2 bytes) => 44 0d
> read: (14 bytes) => 23 7d 00 22 2e d6 80 b1 2e 09 f6 00 ff 0d
> send: (2 bytes) => 44 0d
> read: (14 bytes) => 23 7d 00 22 2e d6 80 b2 2e 09 f6 00 ff 0d

We succeeded in polling for the status four times in a row (without retries), so the communication is fine so far.

> send: (2 bytes) => 44 0d
> read: timed out
> Communications with UPS lost: Status read failed!

...but here the problems start. It looks like we have suddenly developed a communication problem: the UPS doesn't want to talk or listen to us anymore. Usually this has to do with the cable power for the UPS. Currently, the 'powerpanel' driver clears RTS and sets DTR. I noticed that both the 'cyberpower' and 'nitram' drivers set both. Could you check whether changing line 93 in powerpanel.c to

    ser_set_rts(upsfd, 1);

fixes the problem? I suspect that clearing RTS might be a mistake here.

Best regards, Arjen

--
Please keep list traffic on the list
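
As an illustration of the autodetection remark above, the 20-byte identification reply from the quoted debug output can be decoded by hand. The following is a minimal Python sketch, not part of NUT; the helper name decode_hex_dump is made up for this example, and the only input is the hex dump copied from the log.

```python
def decode_hex_dump(dump: str) -> bytes:
    """Convert a space-separated hex dump from the debug output into bytes."""
    return bytes(int(token, 16) for token in dump.split())


# Identification reply copied from the debug output quoted in this thread.
ID_REPLY = "2e 4f 50 31 35 30 30 20 20 20 20 2e 78 2e 3c 2e 35 31 30 30"

raw = decode_hex_dump(ID_REPLY)
# Show printable ASCII as-is and anything else as '?'.
text = "".join(chr(b) if 32 <= b < 127 else "?" for b in raw)
print(len(raw), repr(text))
```

The model string OP1500 starts at the second byte, which matches the model Arjen says the driver reports. The meaning of the remaining fields is not discussed in this thread, so the sketch makes no attempt to interpret them.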
Arjen de Korte
2008-Oct-29 21:16 UTC
[Nut-upsuser] Cyberpower/powerpanel error: Data stale
Quoting Kjell Claesson <kjell.claesson at epost.tidanet.se>:

> The driver looks OK, and changing the timing (delays) in it is not going to
> help.

There is one slight problem in it. The ser_get_buf_len() function that is used does not differentiate between 'no characters read' and 'not enough characters read'. It either returns the requested number of characters or '-1'. This fooled me once again, so I think I will rework that part. Lines 382 - 385 in powerp-bin.c are basically a no-op now.

I think the best way to fix this would be to change the ser_get_buf_len() function in serial.c, as other drivers also seem to expect that on timeout the number of characters actually read is returned as well. I'll check with the other drivers that use this function (and possibly others as well). I guess most drivers already check whether the returned number of characters is what they expect, so this should have little impact.

Something similar should be done for partial sending of data, although here quite a few drivers don't seem to bother checking the return code of the ser_send_* functions at all.

Best regards, Arjen

--
Please keep list traffic on the list
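
The distinction described above, between 'no characters read' and 'not enough characters read', can be sketched as follows. This is a Python illustration of the proposed behaviour only, not the NUT implementation of ser_get_buf_len(); the function name read_up_to is hypothetical, a pipe stands in for the serial port, and a POSIX system is assumed so that select() works on it.

```python
import os
import select


def read_up_to(fd: int, want: int, timeout_s: float) -> bytes:
    """Read up to `want` bytes; on timeout return whatever arrived so far.

    An empty result means a real timeout (nothing received at all);
    a short, non-empty result means a partial reply.
    """
    buf = b""
    while len(buf) < want:
        # The timeout restarts for every chunk that is waited for.
        ready, _, _ = select.select([fd], [], [], timeout_s)
        if not ready:
            break  # timed out: keep the partial data instead of discarding it
        chunk = os.read(fd, want - len(buf))
        if not chunk:
            break  # end of file
        buf += chunk
    return buf


# Demonstration with a pipe standing in for the serial port.
r, w = os.pipe()
os.write(w, bytes.fromhex("2e09f6ff"))  # a 4-byte short reply
partial = read_up_to(r, 20, 0.05)       # asked for 20 bytes, only 4 available
nothing = read_up_to(r, 20, 0.05)       # nothing left to read: real timeout
print(len(partial), len(nothing))
os.close(r)
os.close(w)
```

With a return value like this, a caller can report "expected 20 bytes but only got 4" separately from "no reply at all", which a single '-1' for both cases cannot express.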
Arjen de Korte
2008-Oct-29 21:17 UTC
[Nut-upsuser] Cyberpower/powerpanel error: Data stale
Quoting Seann Clark <nombrandue at tsukinokage.net>:

> After compiling the modified driver I am seeing this:
>
> Tue Oct 28-11:33:30-root at haruhi:drivers> ./powerpanel -DDD -u nut -a
> cyberpower-ups
> Network UPS Tools - CyberPower text/binary protocol UPS driver 0.23 (2.2.2)
> Warning: This is an experimental driver.
> Some features may not function correctly.
>
> debug level is '3'
> Trying binary protocol...
> read: (4 bytes) => 2e 09 f6 ff
> Expected 20 bytes but only got 4
> read: (20 bytes) => 2e 4f 50 31 35 30 30 20 20 20 20 2e 78 2e 3c 2e 35 31
> 30 30
> CyberPower UPS with binary protocol on /dev/ttyS0 detected

[...]

> send: (2 bytes) => 44 0d
> read: timed out
> Communications with UPS lost: Status read failed!
> send: (2 bytes) => 44 0d
> read: timed out
> Communications with UPS lost: Status read failed!
> send: (2 bytes) => 44 0d
> read: timed out
> Communications with UPS lost: Status read failed!
> send: (2 bytes) => 44 0d
> read: timed out
> Communications with UPS lost: Status read failed!
> send: (2 bytes) => 44 0d
> read: timed out
> Communications with UPS lost: Status read failed!
> ^CSignal 2: exiting
>
> So that didn't fix the issue, but judging from your statements,
> related to it, it wasn't expected to really solve it, more just to
> change it to be more like the old driver that 'works'.

In fact, it only got worse (the first read now fails). So this is not a good idea; we should leave RTS as it is.

> It looks like it got a few more reads into that though before
> failing on the status.

That's probably just a coincidence.

What might be useful now is to check on this system whether nut-2.2.1 is able to keep the connection alive. As Kjell already wrote, this is probably a regression in the kernel, so I don't have high hopes that it will. If you really feel like digging into this, running the driver through 'strace' might give some useful info.

Best regards, Arjen

--
Please keep list traffic on the list
Arjen de Korte
2008-Oct-30 20:19 UTC
[Nut-upsuser] Cyberpower/powerpanel error: Data stale
Quoting Kjell Claesson <kjell.claesson at epost.tidanet.se>:

[ crossposting to nut-upsdev ]

> Hm, maybe I'm wrong this time too. The answer is about 120 mS and the delay
> for answer in the read-buf is 250 mS, then a respond delay (sleep) of 200 mS.
>
> So 250-120=130+200=330 mS
>
> So the ups have about 330 mS to respond. This should be more than enough,
> but I have seen slower equipment.

Actually, it is even longer. :-)

The delay after the last character is written to the serial port is 50 ms. After that the driver sleeps for 200 ms. When it wakes up again, something must be available for reading within 250 ms (not necessarily the full status, a single character would be sufficient). Sending one character to the UPS takes about 8 ms, so the UPS has close to half a second to produce the first character of the reply. After reception of the first character, each subsequent read is subject to a new timeout of 250 ms. Theoretically this means that a status poll in this driver may take up to 100 + 200 + 14 x 250 = 3800 ms, or 3.8 seconds.

Nevertheless, I think it would be worthwhile to distinguish between real timeouts (where we don't get any reply at all) and failures to read full replies (which could also be due to receiver overruns). I will check the compatibility of such a change with the existing drivers before making it.

Best regards, Arjen

--
Please keep list traffic on the list
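
The timing budget in the message above can be checked with a few lines of arithmetic. This Python sketch uses only the figures quoted in the thread; the constant and function names are made up for illustration, and reading the 100 ms term as two command characters at 50 ms each is an assumption, since the message does not spell that out.

```python
# All figures are in milliseconds and come from the message above.
POST_WRITE_DELAY_MS = 50    # delay after a character is written to the port
SLEEP_BEFORE_READ_MS = 200  # driver sleeps before the first read
READ_TIMEOUT_MS = 250       # timeout applied to each read
COMMAND_LEN = 2             # status request "44 0d" (assumption: 50 ms per character)
REPLY_LEN = 14              # status reply length seen in the debug output


def first_char_window_ms() -> int:
    """Time the UPS has, counted from the last command character, to start replying."""
    return POST_WRITE_DELAY_MS + SLEEP_BEFORE_READ_MS + READ_TIMEOUT_MS


def worst_case_poll_ms() -> int:
    """Upper bound for one poll if every reply byte arrives just before its timeout."""
    return (COMMAND_LEN * POST_WRITE_DELAY_MS
            + SLEEP_BEFORE_READ_MS
            + REPLY_LEN * READ_TIMEOUT_MS)


print(first_char_window_ms(), worst_case_poll_ms())
```

The first figure corresponds to the "close to half a second" the UPS has for the first reply character, and the second to the 3.8 second worst case for a full status poll.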