Lenz Weber
2015-Jan-07 15:16 UTC
rsync splits filenames, creates special characters where none are, weird permissions
Hello,

I have a quite unusual encoding problem (?). I call rsync with the following parameters:

/usr/bin/rsync -a --delete --numeric-ids --delete-excluded \
    --rsh="/usr/bin/ssh -o StrictHostKeyChecking=no -i \
    /etc/rsnapshot_ssh_certs/mykey" \
    --link-dest=/data/snapshots/hourly.1/folder/mail/ \
    rsyncbackup@server:/var/backups/mail/. \
    /data/snapshots/hourly.0/folder/mail/

Where the local destination /data/snapshots is an NFS volume mounted with the flags (rw,noatime,addr=192.168.1.XX) and the source is a symlink to a zfs snapshot - that looks like this:

/var/backups/mail -> /tank/mail/.zfs/snapshot/zfs-auto-snap_hourly-2015-01-07-1417

As far as I can tell, both systems work with UTF8 just fine (source is Ubuntu 14.04 and target is Debian Lenny).

Now there seems to be a problem while gathering or transferring the file list, as rsync tries to create files/folders that share a part with real files on the source, but with additional characters, sometimes cut off, without the preceding parent folder et cetera.

The source file names in this case look like this:

/var/backups/mail/redacted-domain/catchall/Maildir/.Sent/cur/1313508314.M654736P32713V0000000000000801I000B03CC_6.redacted-hostname,S=42352:2,S
/var/backups/mail/redacted-domain/info/Maildir/.Sent/cur/1313508314.M654736P32713V0000000000000801I000B03CC_6.redacted-hostname,S=42352:2,S

but rsync fails on files like this, that clearly do not exist:

skipping non-regular file "2713V0000000000000801I000B03CC_6.redacted-hostname,S=423\#001\#305\#001O\#233\#240?"
skipping non-regular file "2713V0000000000000801I000B03CC_6.redacted-hostname,S=42352:2,S"
skipping non-regular file "83E13498714.M297793P23544V000"
skipping non-regular file "? \#201"
skipping non-regular file "redacted-domain/catchall/Maildir/.Sent/cur/1301490998.M622842P6671V0000000000000801I00280BD9_0.redacted-hostname\#004"
skipping non-regular file "2713V0000000000000801I000B03CC_6.redacted-hostname,S=pedition/courierimapkeywords/:list"

While skipping/failing is still the good part, it even creates folders and files with names like

dr--rw--wt 2 48 49 4.0K Apr 9 1970 00B03C3_0.redacted-hostname,S=559475:2,S?s?NffJ??
c--SrwS-w- 1 root 66 48, 37 Aug 22 1995 317028.M727693P4967V0000000000000801I000C23B2_0.redacted-hostname,S
dr-Srw---x 2 48 staff 4.0K Aug 15 1995 6683671.M93103P25845V0000000000000801I002E40C9_0.

Take a look at the garbled up file permissions, not to forget that these files are created in the target directory root instead of a subdirectory.

I have been using rsync happily for years, but after adding this new source server, nothing seems to work. Is this a bug in combination with zfs? Is this known? Is there a workaround?

Please help me,
Lenz
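For reading the skipped names above: rsync prints each unprintable byte of a filename as a backslash, a hash, and three octal digits. The following is a minimal sketch, assuming Python 3 is at hand, that turns such a printed name back into raw bytes so the garbage can be inspected byte by byte. The function name `decode_rsync_name` is hypothetical and not part of rsync.

```python
import re

# One escaped byte as rsync prints it: backslash, hash, three octal digits.
# The first digit is limited to 0-3 so the value always fits in one byte.
_ESCAPE = re.compile(rb"\\#([0-3][0-7]{2})")


def decode_rsync_name(shown):
    """Return the raw bytes behind a filename as printed by rsync.

    `shown` is the text between the quotes in a message such as
    'skipping non-regular file "..."'. This helper is illustrative only.
    """
    raw = shown.encode("utf-8", "surrogateescape")
    return _ESCAPE.sub(lambda m: bytes([int(m.group(1), 8)]), raw)
```

Calling `decode_rsync_name(r"S=423\#001\#305\#001O\#233\#240?")` yields the bytes that rsync received in place of the expected `S=42352:2,S` suffix, which makes it easier to compare the corrupted tails across runs.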
Paul Slootman
2015-Jan-07 17:25 UTC
rsync splits filenames, creates special characters where none are, weird permissions
On Wed 07 Jan 2015, Lenz Weber wrote:
> Where the local destination /data/snapshots is an NFS volume mounted with the flags
> (rw,noatime,addr=192.168.1.XX)
> and the source is a symlink to a zfs snapshot - that looks like this:
> /var/backups/mail -> /tank/mail/.zfs/snapshot/zfs-auto-snap_hourly-2015-01-07-1417

Why not skip the NFS part and run rsync to the destination over the network? Rsync is written to minimize network traffic at the cost of local IO, and if you're doing NFS then that "local IO" is really also network traffic. You also eliminate one potential source of problems in that case.

> as far as I can tell, both systems work with UTF8 just fine (source is Ubuntu 14.04 and target is Debian Lenny)
>
> Now there seems to be a problem while gathering or transferring the file list,
> as rsync tries to create files/folders that share a part with real files on the source,
> but with additional characters, sometimes cut off, without the preceding parent folder et cetera.

How often? Every file? 10%? 1%? ...

> The source file names in this case look like this:
>
> /var/backups/mail/redacted-domain/catchall/Maildir/.Sent/cur/1313508314.M654736P32713V0000000000000801I000B03CC_6.redacted-hostname,S=42352:2,S
> /var/backups/mail/redacted-domain/info/Maildir/.Sent/cur/1313508314.M654736P32713V0000000000000801I000B03CC_6.redacted-hostname,S=42352:2,S
>
> but rsync fails on files like this, that clearly do not exist:
>
> skipping non-regular file "2713V0000000000000801I000B03CC_6.redacted-hostname,S=423\#001\#305\#001O\#233\#240?"
> skipping non-regular file "2713V0000000000000801I000B03CC_6.redacted-hostname,S=42352:2,S"
> skipping non-regular file "83E13498714.M297793P23544V000"
> skipping non-regular file "? \#201"
> skipping non-regular file "redacted-domain/catchall/Maildir/.Sent/cur/1301490998.M622842P6671V0000000000000801I00280BD9_0.redacted-hostname\#004"
> skipping non-regular file "2713V0000000000000801I000B03CC_6.redacted-hostname,S=pedition/courierimapkeywords/:list"

Is this reproducible, i.e. does a second run (after cleaning up the mess it left behind) create these same files again, or others?

My first thought is that this combination of factors is triggering some sort of memory problem which is corrupting the filenames. It may also be useful to do a run with --checksum to catch any data corruption (or to see if it finds mismatches where there shouldn't be any).

If this can be narrowed down to a fairly small transfer which goes wrong reproducibly, then by using strace -f on rsync (with -o strace-output.txt) you can perhaps see whether the errors already occur when reading the files or not.

I have not heard of rsync performing this way, so I strongly suspect some hardware problem.

Paul
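To compare what two runs leave behind, it can help to list the suspicious entries in the destination tree mechanically. The following is a minimal sketch, assuming Python 3 on the backup host; the function name `suspicious_entries` and the three checks (control characters in the name, names that are not valid UTF-8, and entries that are neither regular files, directories, nor symlinks) are illustrative choices, not anything rsync provides.

```python
import os
import stat


def suspicious_entries(root):
    """Yield (path, reason) pairs for entries that look like transfer garbage.

    Paths are handled as bytes so that names which are not valid UTF-8
    can still be walked and reported.
    """
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root)):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            mode = os.lstat(path).st_mode
            if any(b < 0x20 or b == 0x7F for b in name):
                yield path, "control character in name"
            else:
                try:
                    name.decode("utf-8")
                except UnicodeDecodeError:
                    yield path, "name is not valid UTF-8"
            # Character devices, block devices, FIFOs and sockets should not
            # appear in a Maildir backup.
            if not (stat.S_ISREG(mode) or stat.S_ISDIR(mode) or stat.S_ISLNK(mode)):
                yield path, "special file"
```

Running `sorted(suspicious_entries("/data/snapshots/hourly.0/folder/mail"))` after each attempt and diffing the results would show whether the same names come back or different ones.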
Wayne Davison
2015-Jan-07 22:16 UTC
rsync splits filenames, creates special characters where none are, weird permissions
On Wed, Jan 7, 2015 at 7:16 AM, Lenz Weber <mail at lenzw.de> wrote:
> [...] rsyncbackup@server:/var/backups/mail/. [...]

Does that login force a particular rsync command via ssh's authorized_keys file? It looks like the data stream is being garbled, and one way that could happen is if the remote rsync and the local rsync aren't using the right options (e.g. if a forced rsync command ignored some of the options from the controlling rsync).

..wayne..
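One way to check for a forced command is to look for a `command="..."` option in front of the key in the remote account's authorized_keys file. The following is a minimal sketch, assuming Python 3; the helper name `forced_command` is hypothetical, and the pattern is a simplification that does not handle every quoting corner case of the options field.

```python
import re

# A command="..." option at the start of the line or after a comma.
# Backslash-escaped characters (such as \") inside the value are allowed.
_FORCED = re.compile(r'(?:^|,)command="((?:\\.|[^"\\])*)"', re.IGNORECASE)


def forced_command(authorized_keys_line):
    """Return the forced command of one authorized_keys line, or None."""
    line = authorized_keys_line.strip()
    if not line or line.startswith("#"):
        return None  # blank lines and comments carry no options
    match = _FORCED.search(line)
    return match.group(1) if match else None
```

Applying it to each line of the backup user's `~/.ssh/authorized_keys` on the source server shows at a glance whether the key used by the backup job is tied to a fixed command. It does not cover restrictions imposed elsewhere, such as a restricted login shell.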
Lenz Weber
2015-Jan-07 22:26 UTC
rsync splits filenames, creates special characters where none are, weird permissions
Hi,

On 07.01.2015 at 18:25, Paul Slootman wrote:
> On Wed 07 Jan 2015, Lenz Weber wrote:
>
>> Where the local destination /data/snapshots is an NFS volume mounted with the flags
>> (rw,noatime,addr=192.168.1.XX)
>> and the source is a symlink to a zfs snapshot - that looks like this:
>> /var/backups/mail -> /tank/mail/.zfs/snapshot/zfs-auto-snap_hourly-2015-01-07-1417
>
> Why not skip the NFS part and run rsync to the destination over the
> network? Rsync is written to minimize network traffic at the cost of
> local IO, and if you're doing NFS then that "local IO" is really also
> network traffic. You also eliminate one potential source of problems in
> that case.

If I were setting up a new backup host, I would consider this, but this is a "grown" platform - and as you know, those are not always easy (or quick) to change, so I'll have to stick to that solution for now.

As for the "potential source of problems" part: This exact data set (source) was residing on another server (without the zfs setup) before, where it backed up just fine. So I think the problem is most likely to be found on the source part, not on the target part.

>> as far as I can tell, both systems work with UTF8 just fine (source is Ubuntu 14.04 and target is Debian Lenny)
>>
>> Now there seems to be a problem while gathering or transferring the file list,
>> as rsync tries to create files/folders that share a part with real files on the source,
>> but with additional characters, sometimes cut off, without the preceding parent folder et cetera.
>
> How often? Every file? 10% 1%? ...

We're speaking of about a million files and a dozen errors on each transfer. But that's just all I can see - usually, the transfer aborts at one point or another with different error messages, so I can't say if there would be more errors if the transfer completed.

Some of these are (going back through my logs)

#case 1: (most of the time I guess)
rsync: connection unexpectedly closed (147733412 bytes received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(635) [receiver=3.0.3]
rsync: connection unexpectedly closed (55 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(635) [generator=3.0.3]

#case 2:
rsync: writefd_unbuffered failed to write 4 bytes [generator]: Broken pipe (32)
rsync error: error in rsync protocol data stream (code 12) at io.c(1544) [generator=3.0.3]

#case 3:
unknown message 31:5178099 [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(475) [generator=3.0.3]
rsync error: received SIGUSR1 (code 19) at main.c(1304) [receiver=3.0.3]

>> The source file names in this case look like this:
>>
>> /var/backups/mail/redacted-domain/catchall/Maildir/.Sent/cur/1313508314.M654736P32713V0000000000000801I000B03CC_6.redacted-hostname,S=42352:2,S
>> /var/backups/mail/redacted-domain/info/Maildir/.Sent/cur/1313508314.M654736P32713V0000000000000801I000B03CC_6.redacted-hostname,S=42352:2,S
>>
>> but rsync fails on files like this, that clearly do not exist:
>>
>> skipping non-regular file "2713V0000000000000801I000B03CC_6.redacted-hostname,S=423\#001\#305\#001O\#233\#240?"
>> skipping non-regular file "2713V0000000000000801I000B03CC_6.redacted-hostname,S=42352:2,S"
>> skipping non-regular file "83E13498714.M297793P23544V000"
>> skipping non-regular file "? \#201"
>> skipping non-regular file "redacted-domain/catchall/Maildir/.Sent/cur/1301490998.M622842P6671V0000000000000801I00280BD9_0.redacted-hostname\#004"
>> skipping non-regular file "2713V0000000000000801I000B03CC_6.redacted-hostname,S=pedition/courierimapkeywords/:list"
>
> Is this reproducible, i.e. a second run (after cleaning up the mess it
> left behind) creates these same files again, or others?

Most of the time it is some kind of pattern within the same run, but different patterns between different runs.

> My first thought is that this combination of factors is triggering some
> sort of memory problems which is corrupting the filenames. It may also
> be useful to do a run with --checksum to catch any data corruption (or
> to see if it finds mismatches where there shouldn't).

I will try this. Though I will try disabling rssh first (Wayne Davison suggested in another mail that an enforced command could be the reason for that, I didn't think of that!). Will send more information tomorrow - let's see how it works out.

> If this can be narrowed down to a fairly small transfer which goes wrong
> reproducibly, then using strace -f on rsync (with -o strace-output.txt)
> then perhaps you can see whether the errors already occur when reading
> the files or not.

I have tried it with significantly smaller datasets and could not reproduce the problem :(

> I have not heard of rsync performing this way, so I strongly suspect
> some hardware problem.
>
> Paul

Thank you very much so far, at least I'm not alone with this intimidating mess :)

Regards,
Lenz
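Since the failures differ from run to run, tallying the `rsync error:` lines across the saved logs may show which case dominates. The following is a minimal sketch, assuming Python 3 and log text in the format quoted in this message; the function name `tally_rsync_errors` and the choice of grouping key are illustrative.

```python
import re
from collections import Counter

# Matches lines such as:
#   rsync error: error in rsync protocol data stream (code 12) at io.c(635) [receiver=3.0.3]
_RSYNC_ERROR = re.compile(
    r"rsync error: (?P<text>.+?) \(code (?P<code>\d+)\) "
    r"at (?P<file>[\w.]+)\((?P<line>\d+)\) \[(?P<role>\w+)=(?P<version>[^\]]+)\]"
)


def tally_rsync_errors(log_text):
    """Count rsync error lines by (exit code, role, source location)."""
    counts = Counter()
    for m in _RSYNC_ERROR.finditer(log_text):
        location = "{}({})".format(m.group("file"), m.group("line"))
        counts[(int(m.group("code")), m.group("role"), location)] += 1
    return counts
```

Feeding it the concatenated logs of several runs and printing `counts.most_common()` gives a quick picture of how often each failure point occurs, for example how often the generator fails at io.c(635) versus io.c(1544).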