samba-bugs at samba.org
2011-Jan-09  20:54 UTC
DO NOT REPLY [Bug 7903] New: Cannot complete sync
https://bugzilla.samba.org/show_bug.cgi?id=7903
           Summary: Cannot complete sync
           Product: rsync
           Version: 3.0.7
          Platform: x64
        OS/Version: Linux
            Status: NEW
          Severity: major
          Priority: P3
         Component: core
        AssignedTo: wayned at samba.org
        ReportedBy: triune at gmail.com
         QAContact: rsync-qa at samba.org
I'm using rsync to mirror a 2TB archive from a LOCAL machine to a REMOTE
one.
The initial sync was performed over a 1gbit lan. Now, it is being performed
over the internet with 2 servers at 50mbits. I've tried everything mentioned
in
the FAQ and other places to remedy this problem to no avail. I've tweaked my
~/.ssh/config to keep it alive (client/server), and changed some of the rsync
settings to keep it active and not timeout. However, the problem persists.
I've
attempted to log the dbug info as mentioned in the FAQ. Below are my rsync
calls and 100 lines of debug output:
$ rsync -aivz --partial --delete-during --stats --progress
--rsync-path=/tmp/rsync-debug /volume/ -e ssh
user at host.org:/home/user/volume_backup ;
[pid  4195] select(1, [0], [], NULL, {60, 0}) = 1 (in [0], left {59, 999996})
[pid  4195] read(0, "3/\225\177\10d\233>\"C\27\260~\332\235`U
^\10H\312\3654+\223\277J\372m\213n"..., 4092) = 4092
[pid  4195] select(1, [0], [], NULL, {60, 0}) = 1 (in [0], left {59, 999995})
[pid  4195] read(0, "\374\17\0\7", 4)   = 4
[pid  4195] select(1, [0], [], NULL, {60, 0}) = 1 (in [0], left {59, 999996})
[pid  4195] read(0,
"\373\266\317\333!]\367\227U\345\27w\206l\307\246\3727\t\205\337;\207\366U\363\33\356{\375\244\273"...,
4092) = 4092
[pid  4195] select(1, [0], [], NULL, {60, 0}) = 1 (in [0], left {59, 999996})
[pid  4195] read(0, "\374\17\0\7", 4)   = 4
[pid  4195] select(1, [0], [], NULL, {60, 0}) = 1 (in [0], left {59, 999996})
[pid  4195] read(0,
"\276;\232\325\353C\300\315v\7Q8\25\v4\303\35\234\217\273\230\327U\373\245;\342\222\4\17\367\305"...,
4092) = 4092
[pid  4195] select(1, [0], [], NULL, {60, 0}) = 1 (in [0], left {59, 999996})
[pid  4195] read(0, "\374\17\0\7", 4)   = 4
[pid  4195] select(1, [0], [], NULL, {60, 0}) = 1 (in [0], left {59, 999996})
[pid  4195] read(0,
"TK\244\3\255\223\246d4\214\235\325r\216\214\241\203\347\331Q\232t@\333#)\f\25\7\30\230\r"...,
4092) = 4092
[pid  4195] select(1, [0], [], NULL, {60, 0}) = 1 (in [0], left {59, 999996})
[pid  4195] read(0, "", 4)              = 0
[pid  4195] select(5, NULL, [4], [4], {60, 0}) = 1 (out [4], left {59, 999994})
[pid  4195] write(4, "S\0\0\nrsync: connection unexpected"..., 87) =
87
[pid  4194] <... select resumed> )      = 1 (in [3], left {31, 468490})
[pid  4195] rt_sigaction(SIGUSR1, {SIG_IGN, [], SA_RESTORER, 0x7fac05ca1af0}, 
<unfinished ...>
[pid  4194] read(3,  <unfinished ...>
[pid  4195] <... rt_sigaction resumed> NULL, 8) = 0
[pid  4194] <... read resumed> "S\0\0\nrsync: connection
unexpected"..., 8184)
= 87
[pid  4195] rt_sigaction(SIGUSR2, {SIG_IGN, [], SA_RESTORER, 0x7fac05ca1af0}, 
<unfinished ...>
[pid  4194] select(4, [3], [1], [1], {60, 0} <unfinished ...>
[pid  4195] <... rt_sigaction resumed> NULL, 8) = 0
[pid  4194] <... select resumed> )      = 1 (out [1], left {59, 999993})
[pid  4195] close(1 <unfinished ...>
[pid  4194] write(1, "S\0\0\nrsync: connection unexpected"..., 87
<unfinished
...>
[pid  4195] <... close resumed> )       = 0
[pid  4194] <... write resumed> )       = -1 EPIPE (Broken pipe)
[pid  4195] write(3,
"\0Ar\264\f\3151\225\350G\v\365\207\203\253\224\37\t}y\255^q}\241\250\341\2Lq2\361"...,
57465 <unfinished ...>
[pid  4194] --- SIGPIPE (Broken pipe) @ 0 (0) ---
[pid  4194] write(2, "rsync: writefd_unbuffered failed"..., 90rsync:
writefd_unbuffered failed to write 87 bytes to socket [generator]: Broken pipe
(32)) = 90
[pid  4194] write(2, "\n", 1
)           = 1
[pid  4194] rt_sigaction(SIGUSR1, {SIG_IGN, [], SA_RESTORER, 0x7fac05ca1af0},
NULL, 8) = 0
[pid  4194] rt_sigaction(SIGUSR2, {SIG_IGN, [], SA_RESTORER, 0x7fac05ca1af0},
NULL, 8) = 0
[pid  4194] getpid()                    = 4194
[pid  4194] kill(4195, SIGUSR1)         = 0
[pid  4194] write(2, "rsync error: error in rsync prot"..., 90rsync
error:
error in rsync protocol data stream (code 12) at io.c(1530) [generator=3.0.7])
= 90
[pid  4194] write(2, "\n", 1
)           = 1
[pid  4194] gettimeofday({1294605317, 147297}, NULL) = 0
[pid  4194] select(0, NULL, NULL, NULL, {0, 100000} <unfinished ...>
[pid  4195] <... write resumed> )       = 57465
[pid  4195] --- SIGUSR1 (User defined signal 1) @ 0 (0) ---
[pid  4195] close(3)                    = 0
[pid  4195]
lstat("G/_/videos/.The.Good.The.Bad.And.The.Ugly.1966.EXTENDED.720p.mkv.rPfKnC",
{st_mode=S_IFREG|0700, st_size=1799413881, ...}) = 0
[pid  4195]
chmod("G/_/videos/.The.Good.The.Bad.And.The.Ugly.1966.EXTENDED.720p.mkv.rPfKnC",
0755) = 0
[pid  4195]
rename("G/_/videos/.The.Good.The.Bad.And.The.Ugly.1966.EXTENDED.720p.mkv.rPfKnC",
"G/_/videos/The.Good.The.Bad.And.The.Ugly.1966.EXTENDED.720p.mkv"
<unfinished
...>
[pid  4194] <... select resumed> )      = 0 (Timeout)
[pid  4194] gettimeofday({1294605317, 247806}, NULL) = 0
[pid  4194] exit_group(12)              = ?
<... rename resumed> )                  = 0
select(5, NULL, [4], [4], {60, 0})      = 1 (out [4], left {59, 999992})
write(4, "Y\0\0\nrsync error: error in rsync "..., 93) = -1 EPIPE
(Broken pipe)
--- SIGPIPE (Broken pipe) @ 0 (0) ---
rt_sigaction(SIGUSR1, {SIG_IGN, [], SA_RESTORER, 0x7fac05ca1af0}, NULL, 8) = 0
rt_sigaction(SIGUSR2, {SIG_IGN, [], SA_RESTORER, 0x7fac05ca1af0}, NULL, 8) = 0
gettimeofday({1294605317, 421295}, NULL) = 0
select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
gettimeofday({1294605317, 521602}, NULL) = 0
exit_group(12)                          = ?
Process 4195 detached
-- 
Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug, or are watching the QA contact.
samba-bugs at samba.org
2011-Jan-11  02:29 UTC
DO NOT REPLY [Bug 7903] Cannot complete sync
https://bugzilla.samba.org/show_bug.cgi?id=7903
triune at gmail.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |triune at gmail.com
------- Comment #1 from triune at gmail.com  2011-01-10 20:29 CST -------
could rsync perhaps implement a new flag to ask it to retry a certain number of
times, or perhaps infinitely... never give up? caching the ssh login would
probably be requred for ssh connections.
-- 
Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug, or are watching the QA contact.
https://bugzilla.samba.org/show_bug.cgi?id=7903
--- Comment #2 from Wayne Davison <wayned at samba.org> 2011-06-04
16:21:24 UTC ---
Rsync doesn't support a reconnect because the protocol is stateful, and
currently only supports starting the transfer from the beginning again (which
can be done by the caller).
What you need to strace is the other side of the connection.  If both sides are
getting a broken connection error, then the connection went down and there is
nothing rsync can do about it.  If there was some kind of error on the other
side, then you can attempt to resolve that error (so that the remote rsync
doesn't go away).  See http://rsync.samba.org/issues.html and the
rsync-debug
script for a possible way to strace the remote side.
Finally, if your connection is just flaking out, you can try dividing up the
mirroring task into smaller chunks that can be sent by a series of rsync tasks.
 One extreme example (that is probably overkill) would be to rsync just a
single directory at a time by using -d instead of -r and still using --delete:
#!/usr/bin/perl
use strict;
my $SRC = '/tmp/src'; # set to '/volume'
my $DEST = '/tmp/dest'; # set to 'user at host.org:/path/...'
chdir $SRC or die "Unable to chdir to $SRC: $!";
$/ = "\0";
open PIPE, '-|', qw( find . -type d -print0 ) or die $!;
while (<PIPE>) {
    chomp;
    if (system qw( rsync -aidR --no-r --del ), "$_/", $DEST) {
        warn "Got error from rsync for directory $_\n";
    }
}
close PIPE;
-- 
Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=7903 --- Comment #3 from .:triune. <triune at gmail.com> 2011-06-04 16:52:45 UTC --- My problem was solved when I swapped out the network cable between my box and my router. Apparently, one of the 8 pieces of copper in the CAT-5 were broken. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=7903
Wayne Davison <wayned at samba.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WORKSFORME
-- 
Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
Possibly Parallel Threads
- rsync 3.0.9 hangs on large file
- DO NOT REPLY [Bug 5695] New: rsync local timeout
- What does this mean? select(1, [0], , NULL, {60, 0}) = 0 (Timeout)
- DO NOT REPLY [Bug 4327] New: Sudden rsync errors
- What does this mean? select(1, [0], , NULL, {60, 0}) = 0 (Timeout)