Thanks in advance for taking the time to read this email.

I'm using rsync to make a copy of my Netbackup catalogs to an offsite server for DRP. I have narrowed the problem down to a few sub-directories that contain 16 directories each, with about 37 files in each directory. None of the files is larger than 300MB. In fact, most of them are between 1K and 5MB.

The problem is that rsync hangs, with no output to the offsite server, within 15 seconds of issuing this command inside a /bin/sh script. The script is running as root.

/usr/local/bin/rsync -avz --bwlimit=3096 --stats --timeout=600 \
    --delete-after --no-detach --numeric-ids \
    --rsync-path=/usr/local/bin/rsync \
    /usr/openv/netbackup/db/images/cave/ \
    drphost.ti.com:/export/d2/nbumaster/openv/netbackup/db/images/cave/ \
    > /tmp/log 2>&1

When I use rsync to sync other image directories, I don't have any problems. I have taken the catalogs off-line (un-mounted) and run fsck on the RAID, which found no problems, so I don't believe it is an inode problem.

Since I added --timeout=600, rsync now dies on its own. Before I did this, rsync would hang for days on both servers until I killed it.

Have any of you seen this problem before?
Do you have any suggestions that could help me debug further?
Would using rsync as a daemon (instead of rsh) help?
How do you set up rsync as a daemon? (rsync.conf ?)

Here is the last output of a truss command showing how rsync hangs:

------------------------------------------
22436: poll(0xFFBE6C78, 1, 600000) = 1
22436: read(7, " 2\0\0\t", 4) = 4
22436: time() = 1029194645
22436: poll(0xFFBE6C78, 1, 600000) = 1
22436: read(7, " 1 0 2 0 0 0 0 0 0 0 / C".., 50) = 50
22436: time() = 1029194645
22436: write(1, " 1 0 2 0 0 0 0 0 0 0 / C".., 50) = 50
22436: poll(0xFFBE6C78, 1, 600000) = 1
22436: read(7, " =\0\0\t", 4) = 4
22436: time() = 1029194645
22436: poll(0xFFBE6C78, 1, 600000) = 1
22436: read(7, " r e c v _ g e n e r a t".., 61) = 61
22436: time() = 1029194645
22436: write(1, " r e c v _ g e n e r a t".., 61) = 61
22436: poll(0xFFBE6C78, 1, 600000) (sleeping...)
22438: poll(0xFFBEF468, 2, -1) (sleeping...)
------------------------------------------

And it just keeps on sleeping.

Here is the output of a snoop command listening for drphost:

------------------------------------------
nbumaster -> drphost.ti.com RSHELL C port=1017
drphost.ti.com -> nbumaster RSHELL R port=1017 \27\0\0\trecv_generator(.
nbumaster -> drphost.ti.com RSHELL C port=1017
drphost.ti.com -> nbumaster RSHELL R port=1017 6\0\0\t1019000000/Cave-
nbumaster -> drphost.ti.com RSHELL C port=1017
drphost.ti.com -> nbumaster RSHELL R port=1017 9\0\0\trecv_generator(1
nbumaster -> drphost.ti.com RSHELL C port=1017
nbumaster -> drphost.ti.com RSHELL C port=1023
drphost.ti.com -> nbumaster RSHELL R port=1023
drphost.ti.com -> nbumaster RSHELL R port=1023 a
nbumaster -> drphost.ti.com RSHELL C port=1023
------------------------------------------

Why did it switch ports here? Is it supposed to?

Here is the output of the last few lines of /tmp/log:

------------------------------------------
1020000000/Cave-v2-v3_1020169327_FULL is uptodate
recv_generator(1020000000/Cave-v2-v3_1020169327_FULL.f.Z,10)
1020000000/Cave-v2-v3_1020169327_FULL.f.Z is uptodate
recv_generator(1020000000/Cave-v2-v3_1020651001_FULL,11)
1020000000/Cave-v2-v3_1020651001_FULL is uptodate
recv_generator(1020000000/Cave-v2-v3_1020651001_FULL.f.Z,12)
1020000000/Cave-v2-v3_1020651001_FULL.f.Z is uptodate
recv_generator(1020000000/Cave-v4-v5_1020737309_FULL,13)
1020000000/Cave-v4-v5_1020737309_FULL is uptodate
recv_generator(1020000000/Cave-v4-v5_1020737309_FULL.f.Z,14)
io timeout after 600 seconds - exiting
rsync error: timeout in data send/receive (code 30) at io.c(85)
_exit_cleanup(code=30, file=io.c, line=85): about to call exit(30)
------------------------------------------

Thanks for any help you can give me.

Regards
John Stephens

--
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
+ John D Stephens    ITS Design Systems    +
+ Texas Instruments  12500 TI BLVD, Dallas +
+ jstephens@ti.com   214-480-6229          +
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
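
P.S. One observation on the truss output above, in case it helps anyone reading it. Each 4-byte read appears to be a small header announcing the size of the read that follows: the byte '2' is ASCII 50 and is followed by a 50-byte read, and '=' is ASCII 61 and is followed by a 61-byte read. Below is a minimal sketch of that decoding. It rests on my assumption that the header is a little-endian 32-bit word with a tag in the top byte and the payload length in the low 24 bits. The function name decode_header is made up for illustration, and I have not checked this against the rsync source.

```python
import struct


def decode_header(header: bytes):
    """Split a 4-byte header into (tag, payload_length).

    Assumed layout (not verified against the rsync source):
    little-endian 32-bit word, tag in the top 8 bits,
    payload length in the low 24 bits.
    """
    if len(header) != 4:
        raise ValueError("expected exactly 4 bytes")
    (word,) = struct.unpack("<I", header)
    return word >> 24, word & 0xFFFFFF


# The two headers seen in the truss output: "2\0\0\t" and "=\0\0\t".
for raw in (b"2\x00\x00\t", b"=\x00\x00\t"):
    tag, length = decode_header(raw)
    print(f"header {raw!r}: tag={tag}, payload length={length}")
```

If that reading is right, the receiver had consumed every complete message it was sent and was simply waiting in poll() for the next header, which never arrived.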