You shouldn't need such a long timeout. The timeout does not cover the
whole length of the run, only the time since data was last transferred.
It's a mystery to me why it quits after 66 minutes rather than 5 hours,
but the real question is why it stops transferring data for so long.
Perhaps something went wrong with the network. I can't connect to that
server to try it; perhaps it is behind a firewall.
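
To illustrate the point about the timeout covering only the gap since the
last transfer, here is a minimal sketch in Python. It is not rsync's code
(rsync is written in C); the function names, chunk sizes and delays are
all made up for the illustration. The sender takes longer overall than the
idle timeout, yet the reader never times out, because the timer starts
afresh each time data arrives.

```python
import select
import socket
import threading
import time


def read_with_idle_timeout(sock, idle_timeout, max_bytes=4096):
    """Read until EOF; fail only if no data arrives for idle_timeout seconds."""
    chunks = []
    while True:
        # Each select() call starts a new wait, so the limit applies to the
        # gap since the last data arrived, not to the total run time.
        readable, _, _ = select.select([sock], [], [], idle_timeout)
        if not readable:
            raise TimeoutError("no data for %.2f seconds" % idle_timeout)
        data = sock.recv(max_bytes)
        if not data:  # peer closed the connection
            return b"".join(chunks)
        chunks.append(data)


def send_slowly(sock, chunks, gap):
    """Hypothetical slow peer: pause for gap seconds before each chunk."""
    for chunk in chunks:
        time.sleep(gap)
        sock.sendall(chunk)
    sock.close()


if __name__ == "__main__":
    receiver, sender = socket.socketpair()
    # 8 chunks, 0.05 s apart: about 0.4 s in total, longer than the
    # 0.3 s idle timeout, but no single gap comes close to 0.3 s.
    worker = threading.Thread(target=send_slowly,
                              args=(sender, [b"x"] * 8, 0.05))
    start = time.monotonic()
    worker.start()
    data = read_with_idle_timeout(receiver, idle_timeout=0.3)
    worker.join()
    print(len(data), "bytes in", round(time.monotonic() - start, 2), "seconds")
```

If the peer goes completely silent instead, the same call raises
TimeoutError once idle_timeout seconds have passed.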
- Dave Dykstra
On Tue, Apr 16, 2002 at 12:36:03PM -0400, Alberto Accomazzi wrote:
> Dear all,
>
> I've been trying to track down a problem with timeouts when pulling data from
> an rsync daemon and I have now run out of useful ideas.
> The problem manifests itself when I try to transfer a large directory tree
> on a slow client machine. What happens then is that the client rsync process
> successfully receives the list of files from the server, then begins checking
> the local directory tree, taking its sweet time. Since I know that the process
> is quite slow, I invoke rsync with a timeout of 5 hours to avoid dropping the
> connection. However, after a little over 1 hour (usually 66 minutes or so),
> the server process simply gives up.
>
> I have verified the problem under rsync versions 2.3.2, and 2.4.6 and up
> (including 2.5.5), testing a few different combinations of client/server
> versions (although the client is always a Linux box and the server always
> a Solaris box). It looks to me as if something kicks the server out of
> the select() call at line 202 of io.c (read_timeout) despite the timeout
> being correctly set to 18000 seconds. Can anybody think of what the
> problem may be? See all the details below.
>
> Thanks,
>
> -- Alberto
>
>
>
> CLIENT:
>
> [ads@ads-pc ~]$ rsync --version
> rsync version 2.5.5 protocol version 26
> Copyright (C) 1996-2002 by Andrew Tridgell and others
> <http://rsync.samba.org/>
> Capabilities: 64-bit files, socketpairs, hard links, symlinks, batchfiles,
> IPv6, 64-bit system inums, 64-bit internal inums
>
> rsync comes with ABSOLUTELY NO WARRANTY. This is free software, and you
> are welcome to redistribute it under certain conditions. See the GNU
> General Public Licence for details.
>
> [ads@ads-pc ~]$ rsync -ptv --compress --suffix .old --timeout 18000 -r --delete rsync://adsfore.harvard.edu:1873/text-4097/. /mnt/fwhd0/abstracts/phy/text/
> receiving file list ... done
> rsync: read error: Connection reset by peer
> rsync error: error in rsync protocol data stream (code 12) at io.c(162)
> rsync: connection unexpectedly closed (17798963 bytes read so far)
> rsync error: error in rsync protocol data stream (code 12) at io.c(150)
>
>
> SERVER:
>
> adsfore-15: /proj/ads/soft/utils/src/rsync-2.5.5/rsync --version
> rsync version 2.5.5 protocol version 26
> Copyright (C) 1996-2002 by Andrew Tridgell and others
> <http://rsync.samba.org/>
> Capabilities: 64-bit files, socketpairs, hard links, symlinks, batchfiles,
> no IPv6, 64-bit system inums, 64-bit internal inums
>
> rsync comes with ABSOLUTELY NO WARRANTY. This is free software, and you
> are welcome to redistribute it under certain conditions. See the GNU
> General Public Licence for details.
>
> from the log file:
>
> 2002/04/16 08:52:48 [18996] rsyncd version 2.5.5 starting, listening on port 1873
> 2002/04/16 09:39:01 [988] rsync on text-4097/. from ads-pc (131.142.43.117)
> 2002/04/16 10:51:36 [988] rsync: read error: Connection timed out
> 2002/04/16 10:51:36 [988] rsync error: error in rsync protocol data stream (code 12) at io.c(162)
>
> from a truss:
>
> adsfore-14: truss -d -p 988
> Base time stamp: 1018964639.2848 [ Tue Apr 16 09:43:59 EDT 2002 ]
> poll(0xFFBE4E90, 1, 18000000) (sleeping...)
> 4057.4093 poll(0xFFBE4E90, 1, 18000000)                 = 1
> 4057.4098 read(3, 0xFFBE5500, 4)                        Err#145 ETIMEDOUT
> 4057.4103 time()                                        = 1018968696
> 4057.4106 getpid()                                      = 988 [18996]
> 4057.4229 write(4, " 2 0 0 2 / 0 4 / 1 6 1".., 66)      = 66
> 4057.4345 sigaction(SIGUSR1, 0xFFBE4D20, 0xFFBE4DA0)    = 0
> 4057.4347 sigaction(SIGUSR2, 0xFFBE4D20, 0xFFBE4DA0)    = 0
> 4057.4349 time()                                        = 1018968696
> 4057.4350 getpid()                                      = 988 [18996]
> 4057.4352 write(4, " 2 0 0 2 / 0 4 / 1 6 1".., 98)      = 98
> 4057.4357 llseek(0, 0, SEEK_CUR)                        = 0
> 4057.4359 _exit(12)
>
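
The timestamps in the log and the truss output above can be cross-checked
with a little arithmetic. The sketch below uses only values copied from
that output; the variable names are my own. It works out how long the
connection lasted, converts the truss offset of the failing read() to an
absolute time, and compares both with the 18000-second timeout. Note that
poll() returned 1 rather than 0, so its own timeout did not expire; that,
together with the ETIMEDOUT from read(), suggests (though it does not
prove) that the error was reported by the TCP layer and not by rsync's
timer.

```python
from datetime import datetime, timedelta

LOG_FORMAT = "%Y/%m/%d %H:%M:%S"

# From the rsyncd log: connection start and the read error.
connect_time = datetime.strptime("2002/04/16 09:39:01", LOG_FORMAT)
error_time = datetime.strptime("2002/04/16 10:51:36", LOG_FORMAT)
elapsed_seconds = (error_time - connect_time).total_seconds()

# From truss -d: base time stamp (epoch seconds) and the offset of read().
truss_base_epoch = 1018964639.2848
read_offset = 4057.4098
read_epoch = truss_base_epoch + read_offset

# The same instant as local wall-clock time, starting from the base
# time stamp truss printed (09:43:59 EDT plus the .2848 fraction).
truss_base_local = datetime(2002, 4, 16, 9, 43, 59) + timedelta(seconds=0.2848)
read_wallclock = truss_base_local + timedelta(seconds=read_offset)

# poll() takes milliseconds; 18000000 ms is the --timeout of 18000 s.
poll_timeout_seconds = 18000000 / 1000

print("connection lasted", elapsed_seconds / 60, "minutes")
print("read() failed at epoch", int(read_epoch))
print("read() failed at local time", read_wallclock.strftime("%H:%M:%S"))
print("fraction of timeout used", elapsed_seconds / poll_timeout_seconds)
```

The connection in this run lasted 4355 seconds (a little under 73
minutes), the computed epoch second agrees with the value returned by
time() in the trace, and the wall-clock time agrees with the 10:51:36
entry in the log.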
>
> ****************************************************************************
> Alberto Accomazzi                             mailto:aaccomazzi@cfa.harvard.edu
> NASA Astrophysics Data System                 http://adsabs.harvard.edu
> Harvard-Smithsonian Center for Astrophysics   http://cfawww.harvard.edu
> 60 Garden Street, MS 83, Cambridge, MA 02138 USA
> ****************************************************************************
>