Hello,

I am currently working on the nut scanner. For detecting available upsd on the network, I rely on upscli_connect. The problem with this function is that it calls a blocking "connect" function. So the nut-scanner is blocked while waiting for the TCP timeout when trying to connect to an IP without upsd available. Since this timeout may be rather long (3 minutes on my host), I would like to add a timeout parameter to upscli_connect.

I propose to add a upscli_tryconnect function accepting a timeout parameter, which will be the copy of the current upscli_connect + the management of the timeout. The upscli_connect will be only a wrapper on top of upscli_tryconnect, calling it without timeout.

Please let me know if this makes sense to you. See a diff in attachment.

Regards,
Fred

--
Team Open Source Eaton - http://powerquality.eaton.com

[Attachment scrubbed by the list archive: diff_upsclient (text/x-patch, 2000 bytes)
URL: <http://lists.alioth.debian.org/pipermail/nut-upsdev/attachments/20110628/1127f4d1/attachment.bin>]

On Jun 28, 2011, at 4:51 AM, Frédéric Bohé wrote:

> I propose to add a upscli_tryconnect function accepting a timeout
> parameter, which will be the copy of the current upscli_connect + the
> management of the timeout. The upscli_connect will be only a wrapper
> on top of upscli_tryconnect, calling it without timeout.
> Please let me know if this makes sense to you.

SOCK_NONBLOCK is non-portable - you can use fcntl to set O_NONBLOCK on the socket after creating it, but before connecting.

You will also want to increment the libupsclient version number at the end of clients/Makefile.am due to the additional function:

http://www.gnu.org/software/libtool/manual/libtool.html#Updating-version-info

Other than that, it sounds good to me.

Quoting Frédéric Bohé <fredericbohe at eaton.com>:

> I am currently working on the nut scanner. For detecting available upsd
> on the network, I rely on upscli_connect. The problem with this function
> is that it calls a blocking "connect" function.

This is a serious problem, not only for the nut-scanner but for basically all nut clients we currently bundle.

> So the nut-scanner is
> blocked while waiting for the TCP timeout when trying to connect to an
> IP without upsd available . Since this timeout may be rather long (3
> minutes on my host), I would like to add a timeout parameter to
> upscli_connect.

I don't think this is a good idea.

> I propose to add a upscli_tryconnect function accepting a timeout
> parameter, which will be the copy of the current upscli_connect + the
> management of the timeout. The upscli_connect will be only a wrapper on
> top of upscli_tryconnect, calling it without timeout.
> Please let me know if this makes sense to you.

The upscli_connect() call should not block. If it does, that is a problem that needs fixing, rather than adding a timeout. On what kind of system did you test this? If I attempt to connect upsmon to a non-existing server socket, the following is shown (-DDD):

   0.001902 Trying to connect to UPS [myups at localhost]
   0.002667 UPS [myups at localhost]: connect failed: Connection failure: Network is unreachable
   0.002690 do_notify: ntype 0x0005 (COMMBAD)
   0.002704 Communications with UPS myups at localhost lost

Best regards, Arjen

--
Please keep list traffic on the list (off-list replies will be rejected)

On 06/28/2011 02:29 PM, Arjen de Korte wrote:

> Quoting Frédéric Bohé <fredericbohe at eaton.com>:
>
>> I am currently working on the nut scanner. For detecting available upsd
>> on the network, I rely on upscli_connect. The problem with this function
>> is that it calls a blocking "connect" function.
>
> The upscli_connect() call should not block. If it does, that is a
> problem that needs fixing, rather than adding a timeout. On what kind
> of system did you test this? If I attempt to connect upsmon to a
> non-existing server socket, the following is shown (-DDD):
>
>    0.001902 Trying to connect to UPS [myups at localhost]
>    0.002667 UPS [myups at localhost]: connect failed: Connection
>             failure: Network is unreachable
>    0.002690 do_notify: ntype 0x0005 (COMMBAD)
>    0.002704 Communications with UPS myups at localhost lost

Your test is with localhost, so any non-existent socket would get immediately rejected. The nut scanner has to deal with hosts on the network. If they send an ICMP reject, then there is no need for a timeout. But it is very common for a host to simply "eat" attempts to connect to a port not allowed in their firewall. So nut scanner needs a non-blocking version of upscli_connect.

Instead of a timeout, which is problematic (the host may just be slow), I suggest an actual non-blocking call so that N connects may be started, and then waited on as a group using select() or the OS equivalent. upscli_connect_start(...) should return a small integer (e.g. a file descriptor), so the app can track the pending connect in a data structure (along with IP, port, etc.); upscli_connect_wait() will return the small integer of a pending connect that finished; and upscli_connect_finish(int) will complete the operation.

There may still be a need for an overall timeout (passed to upscli_connect_start), since 3 minutes is too long to wait. Something more like 30 seconds would be appropriate for nut scanner.
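
The start/wait/finish split proposed above can be sketched at the plain socket level as follows. This is only an illustration under assumptions: connect_start, connect_wait and connect_finish are hypothetical names that mirror the proposal, none of them exists in libupsclient, the sketch handles IPv4 only, and the timeout is applied per wait rather than passed to the start call.

```c
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical: start a non-blocking TCP connect.
 * Returns the pending socket's file descriptor, or -1 on failure. */
static int connect_start(const struct sockaddr_in *addr)
{
    int flags;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }

    if (connect(fd, (const struct sockaddr *)addr, sizeof(*addr)) < 0
        && errno != EINPROGRESS) {
        close(fd);              /* rejected immediately */
        return -1;
    }
    return fd;                  /* connected, or still in progress */
}

/* Hypothetical: wait up to timeout_sec for any pending connect in fds[]
 * (entries < 0 are skipped). Returns the descriptor of one connect that
 * finished (successfully or not), or -1 on timeout or error. */
static int connect_wait(const int *fds, int n, int timeout_sec)
{
    fd_set wfds;
    struct timeval tv;
    int i, maxfd = -1;

    FD_ZERO(&wfds);
    for (i = 0; i < n; i++) {
        if (fds[i] < 0)
            continue;
        FD_SET(fds[i], &wfds);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    if (maxfd < 0)
        return -1;              /* nothing pending */

    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;
    if (select(maxfd + 1, NULL, &wfds, NULL, &tv) <= 0)
        return -1;

    for (i = 0; i < n; i++)
        if (fds[i] >= 0 && FD_ISSET(fds[i], &wfds))
            return fds[i];
    return -1;
}

/* Hypothetical: complete a connect reported by connect_wait().
 * Returns 0 if the connection is established; otherwise closes fd
 * and returns -1. */
static int connect_finish(int fd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0 || err != 0) {
        close(fd);
        return -1;
    }
    return 0;
}
```

A scanner would call connect_start() for each candidate address, store the returned descriptors next to the IP and port, then loop on connect_wait() and connect_finish() until the list is empty or its overall deadline passes, marking finished entries as -1. Because select() is limited to FD_SETSIZE descriptors, a large scan would need batching or poll().
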