On Wed, 2016-10-12 at 13:30 +1300, Henri Shustak wrote:
> Have you tried performing a copy to a known good local device? If a
> local copy fails, then I would start checking the file system of the
> source and also the hardware of that system.

That's a good idea. I just tried that and it copied no problem.

--
Kip Warner -- Senior Software Engineer
OpenPGP encrypted/signed mail preferred
http://www.thevertigo.com

>> Have you tried performing a copy to a known good local device? If a
>> local copy fails, then I would start checking the file system of the
>> source and also the hardware of that system.
>
> That's a good idea. I just tried that and it copied no problem.

Do you have another system on the local network that you could try this transfer with via SSH?

Given that the local transfer works fine, I would suggest checking the hardware and file system integrity on the remote machine. For the hardware, checking the memory and disks would be the top priority.

You could also try moving the partial file out of the way and transferring again without the partial option.

Hope that helps

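
A minimal sketch of the "move the partial file out of the way" suggestion, to be run on the receiving machine. The function name set_aside_partial and the path in the usage comment are hypothetical; the thread does not name the file or its location. The file is renamed rather than deleted so it can still be inspected afterwards.

```shell
# Rename a leftover partial file so the next rsync run has to send
# that file again from the beginning.
# Assumption: the caller passes the path of the partial file on the
# receiving side; the thread itself does not give that path.
set_aside_partial() {
    partial=$1
    if [ ! -e "$partial" ]; then
        echo "not found: $partial" >&2
        return 1
    fi
    aside="$partial.aside.$$"
    # Print the new name so the caller can find the file later.
    mv -- "$partial" "$aside" && printf '%s\n' "$aside"
}

# Hypothetical usage:
#   set_aside_partial /srv/backup/large-file.img
```

After that, the transfer can be started again with the partial option left out, as suggested above.
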
Try the transfer without -z.

Paul

On 18.10.2016 07:03, Kip Warner wrote:
> From what I can tell, there are no hardware problems. I also ran fsck
> on the drive. The machine seems to be fine.

I can confirm the problem.

Situation here: 2 identical HP Microservers (Debian 7, on-site compiled rsync 3.1.2, connected via OpenVPN).

SSH is used for transport.

Both machines have the correct date/time set via ntpd.

All files on client and server are rw, have the right owner and are copyable on both sides.

The "directory to backup" is a Samba share (I stopped nmbd and smbd, no change). Client: 200GB, 42000 files total. Enough disk space and memory on both sides.

All rsync instances were killed (client and server) before starting rsync.

tcpdump shows me a NOP packet every 2 min.

I can provoke the error by doing this:

1) Start the transfer (rsync scans *all* client files and starts sending a file)

2) ^C rsync on client

3) "pkill rsync" on server until all rsync processes are killed. Same on client (just to be sure)

4) Start the transfer again; now rsync scans the top directories only and hangs (see straces below).

Commandline:

./rsync-debug -v --archive --progress --human-readable --delete-during \
    --rsync-path=/home/backup-hugo/bin/rsync-debug \
    /srv/backup-bernd backup-hugo@backup-hugo.vpn:/srv/

Client says (PID 5909 = rsync, 5910 = ssh)
------------------------------------------------
[...]
5910 10:13:50 select(7, [3 4], [3], NULL, {240, 0}) = 1 (in [4], left {239, 999990})
5910 10:13:50 read(4, "2010_20120119093643.pdf\0\3740O [...] \242}\30:V0124160__Nr.036_vom_10.09.2010_2012011"..., 16384) = 3072

loop:
5910 10:13:50 select(7, [3 4], [3], NULL, {240, 0} <unfinished ...>
5909 10:14:51 <... select resumed> ) = 0 (Timeout)
5909 10:14:51 select(6, [5], [], [5], {60, 0}) = 0 (Timeout)
5909 10:15:51 select(6, [5], [], [5], {60, 0}) = 0 (Timeout)
5909 10:16:51 select(6, [5], [], [5], {60, 0} <unfinished ...>
5910 10:17:51 <... select resumed> ) = 0 (Timeout)
goto loop
------------------------------------------------

Server says (PID 10331 = rsync --server, 10332 = ssh)
------------------------------------------------
[...]
10331 10:13:50 lstat("backup-bernd/Schreibtisch", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
10331 10:13:50 lstat("backup-bernd/VirtualBox", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
10331 10:13:50 lstat("backup-bernd/bin", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
10331 10:13:50 lstat("backup-bernd/projekte", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
10331 10:13:50 lstat("backup-bernd/transfer", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
10331 10:13:50 select(4, [3], [1], [3], {60, 0}) = 1 (out [1], left {59, 999991})
10331 10:13:50 write(1, "\4\0\0\7\3\20\0\0", 8) = 8
10331 10:13:50 select(4, [3], [], [3], {60, 0} <unfinished ...>
10332 10:14:50 <... select resumed> ) = 0 (Timeout)

loop:
10332 10:14:50 select(1, [0], [], [0], {60, 0} <unfinished ...>
10331 10:14:50 <... select resumed> ) = 0 (Timeout)
10331 10:14:50 select(4, [3], [], [3], {60, 0} <unfinished ...>
10332 10:15:50 <... select resumed> ) = 0 (Timeout)
goto loop
------------------------------------------------

--
Bernd Hohmann
Organisationsprogrammierer
Höhenstrasse 2 * 61130 Nidderau
Telefon: 06187/900495 * Telefax: 06187/900493
Blog: http://blog.harddiskcafe.de

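
A small sketch for summarising traces like the ones in this message: it counts, per PID, how many select() calls ended in "= 0 (Timeout)". The function name count_select_timeouts is made up for illustration, and it assumes each trace line starts with the PID, as in the excerpts above (the layout produced, for example, by strace -f -t). It is only a reading aid and says nothing about why the processes are waiting.

```shell
# Count how often each traced PID saw select() return "0 (Timeout)".
# Input: strace lines of the form "PID HH:MM:SS syscall ...", read from
# the files given as arguments or from stdin.
# Both the direct form  "select(...) = 0 (Timeout)"  and the resumed form
# "<... select resumed> ) = 0 (Timeout)" are counted; lines that end in
# "<unfinished ...>" carry no result yet and are skipped.
count_select_timeouts() {
    awk '
        /select/ && /= 0 \(Timeout\)/ { hits[$1]++ }
        END { for (pid in hits) print pid, hits[pid] }
    ' "$@" | sort -n
}

# Hypothetical usage, with the trace saved to a file:
#   count_select_timeouts client-strace.log
```

Fed the client excerpt above, it reports three timeouts for PID 5909 (rsync) and one for PID 5910 (ssh), which matches the picture of both ends waiting on each other.
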
What does lsof tell you? Does rsync hang on a specific file?

I would wonder whether this is an rsync problem. As you said, you killed all processes, so on the second run rsync knows nothing from before...

roland

On 18 October 2016 at 12:08:00 CEST, Bernd Hohmann <hohmann at harddiskcafe.de> wrote:
> I can confirm the problem.
> [...]
> 3) "pkill rsync" on server until all rsync-processes are killed. Same
> on client (just to be sure)
>
> 4) Start the transfer again, now rsync scans the top directories only
> and hangs (see straces below).
> [...]

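
One way to answer the lsof question, as a rough sketch: list the files held open by the running rsync processes on the side that appears hung. The function name list_rsync_open_files is hypothetical, and the sketch assumes pgrep and lsof are installed. The process name defaults to rsync; for the setup described in this thread the binary is called rsync-debug, so that name would be passed instead.

```shell
# Show the open files of every process whose name matches exactly.
# Assumption: pgrep and lsof are available on the machine being checked.
list_rsync_open_files() {
    name=${1:-rsync}
    # pgrep prints one PID per line; lsof -p expects a comma-separated list.
    pids=$(pgrep -x "$name" | paste -sd, -)
    if [ -z "$pids" ]; then
        echo "no process named $name found" >&2
        return 1
    fi
    lsof -p "$pids"
}

# Hypothetical usage on the server from this thread:
#   list_rsync_open_files rsync-debug
```

If the same regular file shows up in the output every time the hang is reproduced, that points at a specific file; if only pipes and sockets are open, the processes are more likely waiting on each other.
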
On Tue, 2016-10-18 at 08:36 +0200, Paul Slootman wrote:
> Try the transfer without -z.
>
> Paul

I ended up giving up. I just removed the 30GB file (which I really didn't need anyway) and the transfer carried on without a hitch.

--
Kip Warner