Friends -- I am posting this to both lists since I think it has to do with some kind of unfortunate interaction. The latest rsync (3.0.7-1) under an up-to-date cygwin on Windows 7 x64 gets into some kind of busy-wait situation when transferring large files over ssh. rsync, ssh, and zip can all be consuming much cpu time.

I downloaded and built rsync 3.0.7 locally, manually editing config.status to turn off HAVE_SOCKETPAIR. The resulting rsync works fine. So, what specific diagnosis and further information would you find most helpful?

A few months ago, Corinna and I went through a whole round of trying to get socketpair to work better, and she developed a BLODA work-around. At her request, I have kept the particular offending BLODA installed. AFAIK, the original problem (complete inability to open a socketpair and fork properly) is gone. This is a different problem, but it seems to have to do with not being able to push data through socket pairs, or to detect the presence of more data, etc.

Regards -- Eliot Moss
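[For context, the call at issue builds a bidirectional channel shared across fork(). Below is a minimal sketch of that pattern; it assumes nothing about rsync's internals, and all names and structure are illustrative only, not rsync's actual code.]

    /* Minimal sketch of the socketpair + fork pattern: a bidirectional
     * AF_UNIX channel shared between parent and child. Illustrative
     * only -- not taken from rsync. */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/socket.h>

    int main(void)
    {
        int fds[2];
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0) {
            perror("socketpair");
            return 1;
        }
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            return 1;
        }
        if (pid == 0) {                 /* child: writes on its end */
            close(fds[0]);
            const char msg[] = "hello from child";
            (void)write(fds[1], msg, sizeof msg);
            close(fds[1]);
            _exit(0);
        }
        close(fds[1]);                  /* parent: reads on the other end */
        char buf[64] = "";
        ssize_t n = read(fds[0], buf, sizeof buf - 1);
        printf("parent read %zd bytes: %s\n", n, buf);
        close(fds[0]);
        return 0;
    }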
As I reported yesterday, I had noticed a problem with the latest rsync (3.0.7-1) under the latest cygwin on Windows 7 x64. Sometimes rsync of a large file over a compressed ssh channel goes into a busy wait with no progress, consuming all the cpu.

Now I have better evidence that points to something in the cygwin socketpair implementation: I built rsync 3.0.7 from source locally and found that the "out of the box" version, which has use of socketpair enabled, exhibits the hanging. If I turn off HAVE_SOCKETPAIR in config.status (by manually editing that line), then the resulting rsync works ok.

My guess is that this has something to do with detecting whether data is available on one of the sockets. It appears that a chain of processes gets built, with pipes/sockets between them, for rsync, ssh, and zip. Given that this occurs only sometimes, and apparently only on rsync transfers of large files (whose contents may not have changed much), I am at a loss to come up with a simple test case.

Perhaps the rsync implementers can confirm that the only coding difference in the presence of HAVE_SOCKETPAIR is in the construction of the communication channels between processes, and that those file descriptors are otherwise used in the same way, with or without socketpair. Perhaps they could also indicate the style in which rsync uses the sockets? In any case, I take the current evidence as leaning more toward there being a bug in cygwin's socketpair / socket implementation. Again, what further information do you want?

Corinna, IIRC, I still have installed the particular BLODA you wanted me to keep for testing the previous socketpair issue, fixed back in November ....

Regards -- Eliot Moss
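[To make the suspected failure mode concrete, here is a minimal self-contained sketch, my own construction rather than rsync's code, of the pattern in question: select() reporting a socketpair end readable, followed by a nonblocking read(). If the readiness report were wrong and read() kept returning EAGAIN, a caller that simply retries would spin at full CPU, which matches the symptom described above.]

    /* Hypothetical test sketch: select() readiness vs. nonblocking
     * read() on one end of a socketpair. If select() says "readable"
     * but read() keeps returning -1/EAGAIN, the loop busy-waits. */
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/select.h>
    #include <sys/socket.h>

    int main(void)
    {
        int fds[2];
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0) {
            perror("socketpair");
            return 1;
        }
        fcntl(fds[0], F_SETFL, O_NONBLOCK);
        (void)write(fds[1], "x", 1);    /* make data available */

        char c;
        for (;;) {
            fd_set rfds;
            FD_ZERO(&rfds);
            FD_SET(fds[0], &rfds);
            if (select(fds[0] + 1, &rfds, NULL, NULL, NULL) < 0) {
                perror("select");
                return 1;
            }
            ssize_t n = read(fds[0], &c, 1);
            if (n == 1) {
                printf("read ok after select -- no bug visible\n");
                return 0;
            }
            if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
                /* a correct implementation should never reach here */
                fprintf(stderr, "select said readable but read gave %s\n",
                        strerror(errno));
            }
        }
    }

[On a correct socket implementation this prints "read ok" and exits; a busy loop emitting the EAGAIN diagnostic would point at the readiness reporting.]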
On Tue, Feb 23, 2010 at 12:55 PM, Eliot Moss <moss at cs.umass.edu> wrote:
> The latest rsync (3.0.7-1) under an up-to-date cygwin on Windows 7 x64
> gets into some kind of busy wait situation when transferring large
> files over ssh. rsync, ssh, and zip can all be consuming much cpu
> time. [...] I downloaded and built rsync 3.0.7 locally, manually
> editing config.status to turn off HAVE_SOCKETPAIR. The resulting
> rsync works fine.

I'd imagine that both ssh and rsync start using a lot of CPU because the socketpair must be indicating that it is ready for a write (or read) but the actual write() (or read()) fails to return any bytes (as long as errno is something like EAGAIN, EINTR, or EWOULDBLOCK, rsync will try again).

If you want to test that theory, you could add some prints to rsync's io.c file near the 3 uses of EWOULDBLOCK and have it output what errno it gets. If you get that fixed, the programs that interface with a socketpair should go back to normal.

..wayne..
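[For anyone wanting to try Wayne's suggestion, a small helper along these lines could be dropped near the three EWOULDBLOCK sites in io.c. The helper name and output format are my own invention, not anything in rsync.]

    /* Hypothetical debug helper for rsync's io.c retry paths; the name
     * and message format are illustrative, not part of rsync. */
    #include <errno.h>
    #include <stdio.h>
    #include <string.h>

    static void log_retry(const char *where, int fd)
    {
        int saved = errno;              /* fprintf may clobber errno */
        fprintf(stderr, "[io debug] %s: fd=%d errno=%d (%s)\n",
                where, fd, saved, strerror(saved));
        errno = saved;                  /* restore for the retry logic */
    }

    /* Possible usage at each retry site, e.g.:
     *     if (errno == EAGAIN || errno == EWOULDBLOCK) {
     *         log_retry("read loop", fd);
     *         ...existing retry...
     *     }
     */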