samba-bugs at samba.org
2010-May-19 12:31 UTC
DO NOT REPLY [Bug 7438] New: rsync timeouts and stays in state D
https://bugzilla.samba.org/show_bug.cgi?id=7438
Summary: rsync timeouts and stays in state D
Product: rsync
Version: 3.0.7
Platform: Other
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P3
Component: core
AssignedTo: wayned at samba.org
ReportedBy: raphink at gmail.com
QAContact: rsync-qa at samba.org
I'm syncing a rather big list of small files using rsync. My client is rsync
3.0.7-1ubuntu1 running on Ubuntu Lucid, and the server is a backport for rsync
3.0.0-1 for Debian Etch (although I had the same problem with rsync 2.6.9 as
server, so it seems to be a problem on the client side).
Some time in the middle of the sync, the client freezes. When I kill it (using
ctrl+C or kill), the client returns with an error "undefined error"
but the
process keeps running in state D (as seen in ps), so it cannot be killed. The
servers returns, too, and is not seen anymore in the process list.
Here are the last lines of the strace tracks on both the client and the server:
client:
================lstat("archives/private/sun-exploit-build/2009-August/date.html",
0x7fff4c604ec0) = -1 ENOENT (No such file or directory)
lstat("archives/private/sun-exploit-build/2009-August/index.html",
{st_mode=S_IFLNK|0777, st_size=11, ...}) = 0
lstat("archives/private/sun-exploit-build/2009-August/subject.html",
0x7fff4c604ec0) = -1 ENOENT (No such file or directory)
lstat("archives/private/sun-exploit-build/2009-August/thread.html",
0x7fff4c604ec0) = -1 ENOENT (No such file or directory)
select(5, [3], [4], [4], {60, 0}) = 1 (out [4], left {59, 999997})
write(4,
"\37\3\0\7\0\1\0\240\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\240\0\0\0\0\0"...,
803) = 803
gettimeofday({1274270586, 245243}, NULL) = 0
select(4, [3], [], NULL, {60, 0}) = 0 (Timeout)
select(4, [3], [], NULL, {60, 0}) = 0 (Timeout)
select(4, [3], [], NULL, {60, 0}) = 0 (Timeout)
select(4, [3], [], NULL, {60, 0}) = 0 (Timeout)
select(4, [3], [], NULL, {60, 0}) = 0 (Timeout)
select(4, [3], [], NULL, {60, 0}) = 0 (Timeout)
select(4, [3], [], NULL, {60, 0}) = 0 (Timeout)
select(4, [3], [], NULL, {60, 0}) = 0 (Timeout)
select(4, [3], [], NULL, {60, 0}) = 0 (Timeout)
select(4, [3], [], NULL, {60, 0}^C <unfinished ...>
================
server:
================select(2, NULL, [1], [1], {60, 0}) = 1 (out [1], left {60, 0})
write(1, "\374\17\0\7", 4) = 4
select(2, NULL, [1], [1], {60, 0}) = 1 (out [1], left {60, 0})
write(1, "\"> (Priorité supervision:"..., 4092) =
4092
select(2, NULL, [1], [1], {60, 0}) = 1 (out [1], left {60, 0})
write(1, "\374\17\0\7", 4) = 4
select(2, NULL, [1], [1], {60, 0}) = 1 (out [1], left {60, 0})
write(1, "AHx8fA6H3JUKXLFmiAGzelQDAw527\nAl"..., 4092) = 4092
select(2, NULL, [1], [1], {60, 0}) = 1 (out [1], left {60, 0})
write(1, "\374\17\0\7", 4) = 4
select(2, NULL, [1], [1], {60, 0}) = 1 (out [1], left {60, 0})
write(1, "itOSosZqrN5to9mRnA\nOGBL3x59BrQUd"..., 4092) = 4092
select(2, NULL, [1], [1], {60, 0}) = 1 (out [1], left {60, 0})
write(1, "\374\17\0\7", 4) = 4
select(2, NULL, [1], [1], {60, 0}) = 1 (out [1], left {60, 0})
write(1, "eric at lists.sourceforge.x-echo.co"..., 4092) = 4092
select(2, NULL, [1], [1], {60, 0}) = 1 (out [1], left {60, 0})
write(1, "\374\17\0\7", 4) = 4
select(2, NULL, [1], [1], {60, 0}) = 0 (Timeout)
select(2, NULL, [1], [1], {60, 0}) = 0 (Timeout)
select(2, NULL, [1], [1], {60, 0}) = 0 (Timeout)
select(2, NULL, [1], [1], {60, 0}) = 0 (Timeout)
select(2, NULL, [1], [1], {60, 0}) = 0 (Timeout)
select(2, NULL, [1], [1], {60, 0}) = 0 (Timeout)
select(2, NULL, [1], [1], {60, 0}) = 0 (Timeout)
select(2, NULL, [1], [1], {60, 0}) = 0 (Timeout)
select(2, NULL, [1], [1], {60, 0}) = 1 (out [1], left {51, 216000})
write(1, ".131.30])\n\tby gwprod01m1.infra.m"..., 4092) = -1 EPIPE
(Broken
pipe)
--- SIGPIPE (Broken pipe) @ 0 (0) ---
write(2, "rsync: writefd_unbuffered failed"..., 79) = -1 EPIPE (Broken
pipe)
--- SIGPIPE (Broken pipe) @ 0 (0) ---
rt_sigaction(SIGUSR1, {SIG_IGN}, NULL, 8) = 0
rt_sigaction(SIGUSR2, {SIG_IGN}, NULL, 8) = 0
write(2, "rsync error: errors with program"..., 83) = -1 EPIPE (Broken
pipe)
--- SIGPIPE (Broken pipe) @ 0 (0) ---
rt_sigaction(SIGUSR1, {SIG_IGN}, NULL, 8) = 0
rt_sigaction(SIGUSR2, {SIG_IGN}, NULL, 8) = 0
gettimeofday({1274271146, 160025}, NULL) = 0
select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
gettimeofday({1274271146, 259470}, NULL) = 0
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
gettimeofday({1274271146, 263457}, NULL) = 0
exit_group(13) = ?
Process 20281 detached
================
I can't find where this problem comes from, but I get it every time I
relaunch
rsync.
--
Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug, or are watching the QA contact.
samba-bugs at samba.org
2010-Jun-19 16:54 UTC
DO NOT REPLY [Bug 7438] rsync timeouts and stays in state D
https://bugzilla.samba.org/show_bug.cgi?id=7438
wayned at samba.org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |WORKSFORME
------- Comment #1 from wayned at samba.org 2010-06-19 11:54 CST -------
A non-killable process typically means that it is stuck in the kernel, and
can't be killed until it comes out again. That is also likely to be the
cause
of the hang -- it is trying to perform an operation that is hanging in the
kernel.
You should check the system calls that rsync is doing, perhaps by running it
under an "strace" or "truss" type of program. That can help
you to find out
what is triggering the problem.
--
Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug, or are watching the QA contact.