samba-bugs at samba.org
2013-Jul-27 18:14 UTC
[Bug 10051] New: Improved long file-name handling
https://bugzilla.samba.org/show_bug.cgi?id=10051
Summary: Improved long file-name handling
Product: rsync
Version: 3.1.0
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: P5
Component: core
AssignedTo: wayned at samba.org
ReportedBy: me at haravikk.com
QAContact: rsync-qa at samba.org
One of the issues with running rsync between two different systems is that the
target file-system may have stricter limits on the length of a file name, or
even of a whole file path. I'm not sure the latter can be resolved easily, but
long names cause two main errors:
rsync: recv_generator: failed to stat "/foo/really_long_name": File name too long (36)
rsync: mkstemp "/foo/" failed: No such file or directory (2)
Basically, any attempt to stat an existing file on the receiving end will fail
(it probably isn't there anyway). mkstemp then fails later, presumably because
the temporary name is also too long, so no file is actually created. This
produces the strange second error, which reports the target root as not
existing even though it does.
What I would like to propose is a new feature for handling long file-names, by
adding something like the following:
--long-hash (md5|sha1|sha2|none)
--long-hash-ext .rsync.hashed
Quite simply, if a file-name is encountered that is too long for the target
file-system, then it is run through the specified hashing algorithm, with the
resulting hash being used as the name instead when transferring the file (or
looking for an existing file).
The default setting of none would throw an rsync error instead, with the
assumption being that renaming the file could introduce errors. For example if
you were rsyncing an application bundle but something was renamed then the
cloned application may not be functional, so an error would be preferable.
However, if you're using rsync for a backup then you may be okay with renaming
the file to ensure that it is at least copied.
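As a rough illustration of the proposed behaviour (this is not rsync code), a
minimal Python sketch follows. The function name dest_name, the 255-byte
limit, and the mapping of "sha2" to SHA-256 are all assumptions made for the
example; a real implementation would have to query the target file-system for
its actual limit.

```python
import errno
import hashlib

# Assumed per-name byte limit; real limits vary by file-system.
NAME_MAX = 255

# Assumed mapping from the proposed option values to hashlib algorithm names.
ALGORITHMS = {"md5": "md5", "sha1": "sha1", "sha2": "sha256"}


def dest_name(name, algo="none", ext=".rsync.hashed", name_max=NAME_MAX):
    """Return the name to use on the target, hashing it only if it is too long."""
    raw = name.encode("utf-8")
    if len(raw) <= name_max:
        return name  # fits as-is, so no renaming is needed
    if algo == "none":
        # Proposed default: report an error rather than silently rename.
        raise OSError(errno.ENAMETOOLONG, "File name too long", name)
    digest = hashlib.new(ALGORITHMS[algo], raw).hexdigest()
    return digest + ext
```

With algo left at "none" a too-long name raises an error, which matches the
application-bundle case; passing "md5" gives the backup-friendly behaviour.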
A possible addition to this feature would be:
--long-hash-namefile *.rsync.name
Basically this lets you choose a format for a name-file; any file that has to
have its name hashed would have a name-file created alongside it using the
specified format. If rsync is sending a hashed file with a matching name-file
then it can open the name-file in order to restore the original file-name.
For example:
I want to rsync the file "hugefilename.txt" with the md5 algorithm set.
rsync will send this as "520b0999cd97ae3af36744e0f9cb1839.rsync.hashed" and
create alongside it a file named "520b0999cd97ae3af36744e0f9cb1839.rsync.name"
containing the original file name, "hugefilename.txt".
Of course the naming of the parameters is entirely for example purposes, but
hopefully you get the idea. Basically, a file with a long file-name has the
name hashed and a suitable extension added. If rsync encounters a file with
that extension then it can look for a name-file to expand. When syncing to a
folder, if a file has a long file-name then rsync can hash that file-name and
look for a .rsync.hashed file to run its usual checks against.
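The name-file round trip described above could look roughly like the
following Python sketch. The helper names (store_hashed, original_name_of,
find_existing) are hypothetical, md5 is fixed purely for brevity, and the
name-file is assumed to contain nothing but the original name.

```python
import hashlib
from pathlib import Path
from typing import Optional

HASHED_EXT = ".rsync.hashed"  # extension from the example option above
NAME_EXT = ".rsync.name"      # name-file extension from the example option above


def _digest(original_name: str) -> str:
    return hashlib.md5(original_name.encode("utf-8")).hexdigest()


def store_hashed(dest_dir: Path, original_name: str, data: bytes) -> Path:
    """Write data under the hashed name and record the original name next to it."""
    digest = _digest(original_name)
    hashed = dest_dir / (digest + HASHED_EXT)
    hashed.write_bytes(data)
    (dest_dir / (digest + NAME_EXT)).write_text(original_name, encoding="utf-8")
    return hashed


def original_name_of(hashed: Path) -> Optional[str]:
    """Return the original name for a hashed file, or None if it cannot be expanded."""
    if not hashed.name.endswith(HASHED_EXT):
        return None
    name_file = hashed.with_name(hashed.name[: -len(HASHED_EXT)] + NAME_EXT)
    if not name_file.exists():
        return None
    return name_file.read_text(encoding="utf-8")


def find_existing(dest_dir: Path, long_name: str) -> Optional[Path]:
    """Look for an already transferred hashed copy of a long-named file."""
    candidate = dest_dir / (_digest(long_name) + HASHED_EXT)
    return candidate if candidate.exists() else None
```

find_existing corresponds to the receiving side hashing the long name and
checking for a .rsync.hashed file before running its usual comparisons.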
--
Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
samba-bugs at samba.org
2019-Aug-03 12:00 UTC
[Bug 10051] Improved long file-name handling
https://bugzilla.samba.org/show_bug.cgi?id=10051
--- Comment #1 from Haravikk <samba at haravikk.com> ---
Wow, was about to post basically this same feature, forgetting I'd already
requested it six years ago!
There's definitely still an argument to be made for rsync to handle file names
better when they are invalid on the target device; however, my original
proposal is far too basic.
I'd like to propose the following altered options:
--rename-dest [error|md5|sha1|sha2]
Determines the behaviour when a filename from the source is invalid on the
target, either due to length or invalid characters. By default, an error is
produced, otherwise a hashing algorithm can be specified to create a compact
new name for the file.
--rename-dest-ext .rsync
Sets a file extension for renamed files.
--dest-meta .meta
Specifies the file extension to use for meta files, into which additional data
about a file's transfer can be written. If a file is renamed, then a file with
the same hashed name but this extension will be created, containing the
original name of the file. For example, a file called
"birthday/anniversary.jpg" is invalid on the target and so is renamed
a1df35adf4b3df93458d84c014b56465.rsync, and alongside it is stored
a1df35adf4b3df93458d84c014b56465.meta with the line:
name:24:birthday/anniversary.jpg
Note the length is specified so that characters in the original name cannot
interfere with the meta file itself.
--source-meta .meta
Specifies the file extension used for meta files on the source side of the
transfer, allowing rsync to check for such files and use them when
transferring files. For example, in the case of a renamed file the meta file
will contain the original file name, allowing rsync to attempt to transfer the
file under its original name if the new target supports it (e.g. a transfer
back to the original source).
--meta .meta
Shorthand for specifying both --dest-meta and --source-meta at the same time.
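To make the options above more concrete, here is a minimal Python sketch of
the rename decision and the length-prefixed meta entry. Everything in it is an
assumption for illustration: the 255-byte limit, the set of forbidden
characters, the function names, counting the length in UTF-8 bytes, and
mapping "sha2" to SHA-256. None of it exists in rsync.

```python
import hashlib

# Hypothetical target limits; a real implementation would need to probe them.
MAX_NAME_BYTES = 255
FORBIDDEN = set('/\\:*?"<>|')
ALGORITHMS = {"md5": "md5", "sha1": "sha1", "sha2": "sha256"}


def is_valid_on_target(name):
    """True if the name fits the assumed limit and has no forbidden characters."""
    return len(name.encode("utf-8")) <= MAX_NAME_BYTES and not (set(name) & FORBIDDEN)


def encode_meta(key, value):
    """Encode one entry as key:length:value, with length counted in UTF-8 bytes."""
    raw = value.encode("utf-8")
    return b"%s:%d:%s\n" % (key.encode("ascii"), len(raw), raw)


def decode_meta(blob):
    """Parse entries from encode_meta, trusting the length rather than delimiters."""
    fields = {}
    pos = 0
    while pos < len(blob):
        key_end = blob.index(b":", pos)
        len_end = blob.index(b":", key_end + 1)
        length = int(blob[key_end + 1:len_end])
        start = len_end + 1
        value = blob[start:start + length]
        if len(value) != length:
            raise ValueError("truncated meta entry")
        fields[blob[pos:key_end].decode("ascii")] = value.decode("utf-8")
        pos = start + length
        if blob[pos:pos + 1] == b"\n":
            pos += 1  # skip the line terminator that follows each entry
    return fields


def rename_for_dest(name, mode="md5", ext=".rsync", meta_ext=".meta"):
    """Return (dest_name, meta_name, meta_bytes); the meta parts are None if unchanged."""
    if is_valid_on_target(name):
        return name, None, None
    if mode == "error":
        raise ValueError("file name is invalid on the target: %r" % name)
    digest = hashlib.new(ALGORITHMS[mode], name.encode("utf-8")).hexdigest()
    return digest + ext, digest + meta_ext, encode_meta("name", name)
```

On the source side, decode_meta(...)["name"] gives back the original name,
which can then be passed through is_valid_on_target for the new target before
deciding whether to restore it. Because the parser reads exactly the stated
number of bytes, a name containing colons or newlines still round-trips.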
Maybe there's still a more elegant way to do this? What's certain at least is
that rsync could really use a way to more reliably handle files that cannot be
transferred properly.
I opted to go for a generic meta file concept as it's possible rsync could use
this for other features in future. For example, files too large for a target
filesystem could be split, with a meta file entry detailing how to reassemble
them from smaller pieces.