Hello gents, while getting rid of gettimeofday, I've spotted something I'd like to discuss in implementation of socket polling in driver/main.c::main and driver/dstate.c::dstate_poll_fds My observations follow: 1/ I'm not very happy with the poll mechanism design as whole. I mean, typically, you implement the poll as { poll(); process(); }* loop; in pseudo-code: shut_down = false; fd_set.init(...); timeout.set(...); while (!shut_down) { status = poll(fd_set, timeout); switch (status) { case POLL_ERROR: // Handle error ... break; case POLL_TIMEOUT: // Handle timeout ... timeout.set(...); break; case POLL_EVENT: // Handle data ... break; } I.e. you typically have a poll function call (select is used in NUT), and you process the result of it in a loop. In driver/main.c, this is done like this { poll_and_process(); }*, i.e. there's an empty loop around function dstate_poll_fds which calls select, processes the result and returns. This approach has the drawback that logic of the { poll(); process(); }* loop is cut by an interface; you need to propagate info about poll result through that interface, which is problematic and ugly. 2/ There is actually possibility of busy loop in case the socket becomes invalid (and any decent code should always count with that). If this happen, select will return and the dstate_poll_fds function will return almost immediately, too (after logging a message). Almost all the time, the return code shall be 0 (timeout didn't occur). This means that the dstate_poll_fds function will be immediately re-executed, select will almost immediately return... Busy loop. 3/ Setting of the timeout is just weird. I actually believe that the intention was completely different than what's implemented. The dstate_poll_fds function 1st argument is the struct timeval passed by value(!) Ergo, each time it's called in the { poll_and_process() }* loop, it's simply set to the same time (in future). The function then computes a difference with the actual time and uses the result as select timeout... I.e, if the poll_interval was simply passed to the function and directly used as select timeout, the result would be exactly the same. I think this definitely isn't the intended functionality; obviously, what was ment is to poll at most poll_interval seconds, then update data and poll again; i.e. the timeout should've been propagated through the dstate_poll_fds interface while being updated by select. If done like this, the code would actually make sense, although... 4/ The propagated timeout MUST NOT be passed directly to select ... although there is one small problem: in case of error, select invalidates BOTH the FD set AND the timeout. So, the timeout may actually get set to just weird very high numbers, which will painfully mess the next (and further) polling. No problem, though, we just need to copy the timeout for select. However, this whole mechanism is just strange; hence my point 1/. Not to mention that it can't really work, now (because of the timeout copying with every call). I think this code should be reviewed. For now, I'm going to just fix it so that it works as probably intended, before, but that won't fix neither the potential busy loop problem nor the broken { poll(); process() }* polling logic. I don't want to rewrite that in scope of "getting rid of obsolete gettimeofdate" task; I think a feature req. or something is due way. Comments? Ideas? Recommendations? All welcome... Thanks, Regards, vasek -- Vaclav Krpec Network UPS Tools Project Eaton Opensource Team -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.alioth.debian.org/pipermail/nut-upsdev/attachments/20121012/901e6844/attachment.html>