Herwig Wittmann
2006-Dec-11 19:35 UTC
rsync /somedir work@backups::somearchive/ gets stuck in huge maildirs, rsync /somedir root@backups:/some/path/ works
hi,

i hope i'm not reporting something well-known; i tried to understand the available bug tracking information. please excuse me if my problem report should not meet your standards, but i want to direct your attention to the following:

http://koffein.org/av/rsync-bugreport/

problem report for rsync 2.6.9 on linux/IA32, 09 Dec 2006
Herwig Wittmann - Initials At Geizhals .at

*******************************************************
"chrooted rsync gets stuck in huge maildirs, retval 12"
*******************************************************

(i'm not sure whether a chroot() mechanism is invoked in my case or not; please see the mentioned URI, which contains the rsync.conf that was used)

*) it seems that rsync runs using the chroot mechanism get stuck reproducibly and time out with return value 12 in Maildirs with > 60 000 (mostly small) files (at least when the transfer runs over a line with limited bandwidth)

*) rsync runs started as superuser, without use of the chroot mechanism, seem to work over the same line

*) the 16 Mbit/s line used for the rsync transfer was monitored continuously during the transfer with a running "ping"; there was no packet loss and > 95% of the round trip times were < 20ms

*) + rsync 2.6.9 on linux/IA32 was used
   + i have read http://rsync.samba.org/ftp/unpacked/rsync/NEWS
   + there should have been plenty of free disk space on both hosts
   + continuous monitoring did not show any network errors
   + i do not want to supply the generated core dump for security reasons, as there might be information disclosure problems

debug run, all mentioned files can be found in http://koffein.org/av/rsync-bugreport/
**********************************************

1) at 15:39, rsync was started as a daemon on host "backup":

    backup:~# date
    Sat Dec 9 15:39:45 CET 2006
    backup:~# strace -ff -o rsyncdaemon.strace rsync --daemon --no-detach \
        --port 5678 --config /etc/rsyncd.conf 2>rsync.out

this later generated the following files on rsync daemon host "backup":

    -rw-r--r-- 1 root root 5596 Dec 9 15:40 rsyncdaemon.strace
    -rw-r--r-- 1 root root 26039248 Dec 9 15:43 rsyncdaemon.strace.13629
    -rw-r--r-- 1 root root 960396 Dec 9 15:43 rsyncdaemon.strace.13682

2) at 15:40, i started the rsync client on rsyncclient host "morework":

    strace -ff -o rsyncclient.strace rsync --rsync-path=/root/rsync-debug \
        --port 5678 \
        --timeout=600 -avz --numeric-ids --delete \
        --password-file=/etc/rsyncd.work \
        /home/xxy work@backups::work/home/xxy/ \
        2> /root/rsyncdebug.err >> /root/rsyncdebug.out

this later generated the following files on rsyncclient host "morework":

    -rw-r--r-- 1 root root 70955672 Dec 9 15:46 rsyncclient.strace
    -rw-r--r-- 1 root root 256 Dec 9 15:46 rsyncdebug.err
    -rw-r--r-- 1 root root 34200 Dec 9 15:42 rsyncdebug.out

3) at 15:42, rsync transferred the last file, then both sides started to wait indefinitely in select() system calls. i waited for two more minutes and then

4) at 15:44, i invoked on the client host:

    morework:~# netstat > netstat-clienthost.txt

5) at 15:44, i invoked on the daemon host:

    backup:~# netstat > netstat-daemonhost.txt
    backup:~# kill -SEGV 13682; mv core core.13682
    mv: cannot stat `core': No such file or directory
    # okay, seems this did not generate a core dump for some reason
    backup:~# kill -SEGV 13629
    # no core dump either
    backup:~# killall -SEGV rsync
    # <<-- but this generated a core file:
    -rw------- 1 root root 364544 Dec 9 15:54 core

Best regards,
Herwig
Herwig Wittmann
2006-Dec-15 12:03 UTC
rsync /somedir work@backups::somearchive/ gets stuck in huge maildirs, rsync /somedir root@backups:/some/path/ works
Hello again,

Herwig Wittmann wrote:
> i hope i'm not reporting something well-known; i tried to
> understand the available bug tracking information. please excuse
> me if my problem report should not meet your standards, but i
> want to direct your attention to the following:
>
> http://koffein.org/av/rsync-bugreport/

I'm not sure whether it is okay to open this as a bugzilla bug or not, would that be okay? It seems I can reproduce it with any huge Maildir folder.

> *******************************************************
> "chrooted rsync gets stuck in huge maildirs, retval 12"
> *******************************************************
[...]

Kind regards,
Herwig Wittmann
Matt McCutchen
2006-Dec-17 17:54 UTC
rsync /somedir work@backups::somearchive/ gets stuck in huge maildirs, rsync /somedir root@backups:/some/path/ works
On line 735 of rsyncdaemon.strace.13682, the receiver is waiting for more data to arrive from the sender. Meanwhile, on line 1829 of rsyncclient.strace, the outgoing buffer on the sender's socket has filled up and the sender is waiting until there is room for more data to be sent to the receiver.

Clearly the network is at fault for not passing the data waiting in the sender's outgoing buffer on to the receiver.

Matt
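One way to cross-check this reading from the socket side is to look at the kernel's queue counters for the rsync connection while the transfer hangs. The following is only a sketch, assuming Linux net-tools `netstat -tn` output (columns Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State) and the port 5678 from the debug run; the helper name `show_queues` is made up for illustration and is not part of rsync.

```shell
#!/bin/sh
# show_queues PORT: read `netstat -tn` output on stdin and print the
# Recv-Q/Send-Q counters of every TCP connection that uses PORT on
# either end. (Hypothetical helper, assumes net-tools column layout.)
show_queues() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && ($4 ~ (":" port "$") || $5 ~ (":" port "$")) {
            printf "%s -> %s recv-q=%s send-q=%s state=%s\n", $4, $5, $2, $3, $6
        }'
}

# Usage on either host while the transfer hangs:
#   netstat -tn | show_queues 5678
```

A Send-Q that stays large on the sending host while Recv-Q on the receiving host stays at 0 would be consistent with data sitting in the sender's outgoing buffer and not reaching the receiver.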
Matt McCutchen
2006-Dec-18 22:21 UTC
rsync /somedir work@backups::somearchive/ gets stuck in huge maildirs, rsync /somedir root@backups:/some/path/ works
I believe you that rsync over SSH works but an rsync daemon hangs; still, the hanging in the case of the daemon seems to be the fault of the network. Possibly the network especially likes port 22 or especially doesn't like port 873. You could run the SSH daemon or the rsync daemon on different ports or run a single-use rsync daemon over SSH and note which combinations succeed and which hang; that might provide more hints about what is causing the hanging.

I'm forwarding your message to the rsync list in case anyone else has ideas.

Matt

On 12/18/06, Herwig Wittmann <hw@geizhals.at> wrote:
> Hi Matt,
>
> sorry, this is long.
>
> Matt McCutchen wrote:
> > On line 735 of rsyncdaemon.strace.13682, the receiver is waiting for
> > more data to arrive from the sender. Meanwhile, on line 1829 of
> > rsyncclient.strace, the outgoing buffer on the sender's socket has
> > filled up and the sender is waiting until there is room for more data
> > to be sent to the receiver.
>
> thank you for reading my straces!
>
> > Clearly the network is at fault for not
> > passing the data waiting in the sender's outgoing buffer on to the
> > receiver.
>
> i really apologize for trying to protest,
>
> and i just hope i'm not leading you on a wrong track, but i really try
> to do everything i can to find any misconfiguration or network
> problems
>
> and i've been using rsync since ~2000 (ok, that admittedly does not say
> much about any qualifications :P)
>
> "but" :)
>
> to me, it seems to be always the same pattern-
> rsync with double-colon (::) syntax repeatedly gets stuck
> (please see the included backup log file excerpts if you can take the
> time), and a single run as root always completes the synchronization
> and thereby clears the problem.
>
> 1) rsync will reproducibly hang in yet unsynchronized huge maildirs by
> invoking: rsync /somedir work@backups::somearchive/
> while the other invocation will always work and synchronize:
> rsync /somedir root@backups:/some/path/
>
> (tried more than 30 times)
>
> 2) there is various other (lower volume) tcp traffic happening on the
> mentioned 16 mbit line, which is monitored by nagios, munin and smokeping.
> none of those tools showed any indications of trouble, and i mentioned
> that i left a ping running during the hanging transfers as well,
> which did not show any packet loss or unusually high round trip times.
> *********************************************************************
>
> i try to supply more (weak) evidence of my claims:
>
> the following shows attempts of a nightly cron job to synchronize a few
> directories using the rsync double-colon syntax, the exact command is:
>
> rsync --timeout=600 --port=873 -avz --numeric-ids --delete \
> --password-file=/etc/rsyncd.work /home/archiver \
> work@backups::work/home/archiver 2>> /root/backup.err >> /root/backup.out
>
> please note that the /home/archiver directory contains a huge number of
> files not yet synchronized to the backup storage host running the rsync
> daemon, and that that rsync run always times out with return value 30.
>
> the exact wording of the error message is:
> --- snip ---
> io timeout after 608 seconds -- exiting
> rsync error: timeout in data send/receive (code 30) at io.c(165) [sender=2.6.9]
> --- snap ---
>
> a backup log file of the last days, showing that
> the not yet manually synchronized /home/archiver/, which contains
> a huge Maildir, always gets "stuck":
>
> 15.12.06 05:10:02 [3980] STARTING BACKUP
> 15.12.06 05:10:47 [3980] /home/heidi status: 0
> ...
> 15.12.06 05:25:56 [3980] /home/big23 status: 0
> 15.12.06 05:41:31 [3980] /home/archiver status: 30
>
> 16.12.06 05:10:01 [11757] STARTING BACKUP
> 16.12.06 05:10:46 [11757] /home/heidi status: 0
> ...
> 16.12.06 05:26:10 [11757] /home/big23 status: 0
> 16.12.06 05:40:43 [11757] /home/archiver status: 30
>
> 17.12.06 05:10:01 [21980] STARTING BACKUP
> 17.12.06 05:10:23 [21980] /home/heidi status: 0
> ...
> 17.12.06 05:22:33 [21980] /home/big23 status: 0
> 17.12.06 05:38:05 [21980] /home/archiver status: 30
>
> 18.12.06 05:10:01 [10205] STARTING BACKUP
> 18.12.06 05:10:28 [10205] /home/heidi status: 0
> ...
> 18.12.06 05:23:13 [10205] /home/big23 status: 0
> 18.12.06 05:37:04 [10205] /home/archiver status: 30
>
> just right now, a single run using
> rsync --rsync-path=/root/rsync-debug \
> --timeout=600 -avz --numeric-ids --delete \
> --password-file=/etc/rsyncd.work \
> /home/archiver root@backups:/storage/mirror/work/home/archiver/ \
> 2> /root/rsyncdebug.err > /root/rsyncdebug.out
>
> synchronized the directory without hanging.
>
> greets,
> herwig
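The port and transport combinations suggested at the top of this message could be laid out as a small test matrix. This is a rough sketch, not a tested procedure: host, module, paths and options are taken from the commands quoted in the thread, but the alternate ports 8873 and 2222 are arbitrary placeholders (an rsync daemon and an sshd would have to be listening on them), and the helper names `try` and `combinations` are made up. By default the script only prints the commands.

```shell
#!/bin/sh
# Print (default) or run the transport/port combinations suggested above.
# Start it with RUN= (set but empty) in the environment to execute the
# commands instead of printing them.
# Ports 8873 and 2222 are placeholders, not values from the thread.
SRC=/home/archiver
HOST=backups
MODULE_DEST="work@${HOST}::work/home/archiver"
SSH_DEST="root@${HOST}:/storage/mirror/work/home/archiver/"
OPTS="--timeout=600 -avz --numeric-ids --delete"
PWFILE=/etc/rsyncd.work

# try LABEL COMMAND...: print the label, then print or run the command.
try() {
    label=$1
    shift
    echo "== $label"
    ${RUN-echo} "$@"
    echo "exit status: $?"
}

combinations() {
    # $OPTS is left unquoted on purpose so it splits into separate options.
    try "rsync daemon, port 873" \
        rsync $OPTS --port=873 --password-file="$PWFILE" "$SRC" "$MODULE_DEST"
    try "rsync daemon, port 8873" \
        rsync $OPTS --port=8873 --password-file="$PWFILE" "$SRC" "$MODULE_DEST"
    try "ssh, port 22" \
        rsync $OPTS "$SRC" "$SSH_DEST"
    try "ssh, port 2222" \
        rsync $OPTS -e "ssh -p 2222" "$SRC" "$SSH_DEST"
    # Single-use daemon spawned over ssh; it reads rsyncd.conf from the
    # remote user's home directory, so the module must be defined there.
    try "single-use rsync daemon over ssh" \
        rsync $OPTS -e "ssh -l root" "$SRC" "$MODULE_DEST"
}

combinations
```

Noting which of the five runs end with exit status 0 and which end in a timeout (code 30 in the logs above) would show whether the hang follows the port, the transport, or the daemon itself. In print mode the quoting of the `-e` argument is lost in the output, so the printed lines are for reading, not for pasting back into a shell.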