samba-bugs@samba.org
2006-Oct-11 05:57 UTC
DO NOT REPLY [Bug 4162] New: Wanted: a mechanism to prevent rsync network compression of compressed files
https://bugzilla.samba.org/show_bug.cgi?id=4162
Summary: Wanted: a mechanism to prevent rsync network compression of compressed files
Product: rsync
Version: 3.0.0
Platform: PPC
OS/Version: Mac OS X
Status: NEW
Severity: enhancement
Priority: P3
Component: core
AssignedTo: wayned@samba.org
ReportedBy: rsync@name99.org
QAContact: rsync-qa@samba.org
rsync, of course, offers the -z flag to compress file data as it is
transferred over the network during a backup.
My experience using this flag (on a 1GHz PPC laptop connected over 802.11g)
has been:
- without the -z flag, the average data rate over the connection is about
2.5MB/s, which is about where you'd expect 802.11g to max out, all things
considered
- with the -z flag, the average data rate over the connection is about
1MB/s, and the CPU is maxed out.
Now, if what was being transferred was a stream of text (compression ratio of
say 4 or so), this would still be a win. But on a modern personal system, the
bulk of the material transferred (by bytes, not by file count) is going to be
photos, audio files and video files, i.e. already-compressed material, so the
mean compression ratio over the entire stream of data is going to be just a
bit over 1, and using -z is a loss.
The obvious issue, then, is how can we get the goodness of -z for text files,
while avoiding the cost of the CPU to compress files that aren't going to
compress much.
Two obvious strategies spring to mind:
* We could track the progress of the compression and bail out if the ratio is
less than some lower limit, maybe 1.2 or so. Maybe run compression till about
8KiB into the file, see how things are going, and either continue compressed
or switch to uncompressed. AND/OR
* We could simply allow for a user-supplied list of files (presumably the same
syntax as backup_excludes) that we would not bother to try to compress. This
may not be as robust a strategy as the first scheme, but it is much easier to
program, and should be good enough for most purposes. I'd recommend in
addition that rsync ship with a starter template for this file that includes
all the usual suspects from *.gzip through *.mp3, *.mov, *.ogg etc.
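
The first strategy above can be sketched roughly as follows. This is only an illustration in Python using zlib, not rsync code: the names probe_ratio and should_compress are hypothetical, and the 8KiB probe size and 1.2 threshold are simply the figures floated in the report.

```python
import os
import zlib

PROBE_BYTES = 8 * 1024  # "about 8KiB into the file", as suggested in the report
MIN_RATIO = 1.2         # "some lower limit, maybe 1.2 or so"


def probe_ratio(sample: bytes, level: int = 6) -> float:
    """Return original_size / compressed_size for a sample of file data."""
    if not sample:
        return 1.0  # nothing to gain on an empty sample
    return len(sample) / len(zlib.compress(sample, level))


def should_compress(data: bytes) -> bool:
    """Compress the leading PROBE_BYTES and keep compressing only if it pays off."""
    return probe_ratio(data[:PROBE_BYTES]) >= MIN_RATIO


if __name__ == "__main__":
    text = b"the quick brown fox jumps over the lazy dog\n" * 200
    noise = os.urandom(PROBE_BYTES)  # stands in for already-compressed media
    print("text :", should_compress(text))
    print("noise:", should_compress(noise))
```

A probe like this costs one small compression pass per file, so the saving comes only from skipping the rest of a large incompressible file.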
--
Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug, or are watching the QA contact.
samba-bugs@samba.org
2007-Jul-12 10:15 UTC
DO NOT REPLY [Bug 4162] Wanted: a mechanism to prevent rsync network compression of compressed files
https://bugzilla.samba.org/show_bug.cgi?id=4162
boris@folgmann.de changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |boris@folgmann.de
------- Comment #1 from boris@folgmann.de 2007-07-12 05:14 CST -------
I would also like to see such a function. On Linux the file command could be
used to determine the exact file type, which is more robust than using a
built-in or user-supplied list of file extensions (like .gz, .bz2, .jpg and so
on).
Instead of calling the file command, rsync could use libmagic directly. That
should also be possible on non-UNIX systems, I think, since the library is
surely portable.
The other solution, testing compression on the first 8k, is also a very good
one, and might be even faster to implement.
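
As a rough illustration of content-based detection, the sketch below checks a file's leading bytes against a few well-known signatures. It is a hand-rolled stand-in for what the file command or libmagic would do far more thoroughly, not a use of libmagic itself; the table and the name sniff_compressed are hypothetical.

```python
from typing import Optional

# A few well-known leading-byte signatures of already-compressed formats.
# A real detector (file / libmagic) knows far more formats than this.
MAGIC_PREFIXES = {
    b"\x1f\x8b": "gzip",
    b"BZh": "bzip2",
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"PK\x03\x04": "zip",
}


def sniff_compressed(header: bytes) -> Optional[str]:
    """Return a format name if the header matches a known signature, else None."""
    for magic, name in MAGIC_PREFIXES.items():
        if header.startswith(magic):
            return name
    return None


if __name__ == "__main__":
    print(sniff_compressed(b"\x1f\x8b\x08\x00..."))  # gzip signature
    print(sniff_compressed(b"plain text file"))      # no signature matched
```

Unlike a suffix list, this still works when a file is misnamed, at the price of reading the start of every file before deciding.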
samba-bugs@samba.org
2007-Jul-14 19:46 UTC
DO NOT REPLY [Bug 4162] Wanted: a mechanism to prevent rsync network compression of compressed files
https://bugzilla.samba.org/show_bug.cgi?id=4162
wayned@samba.org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #2 from wayned@samba.org 2007-07-14 14:46 CST -------
Rsync was already skipping the list of file suffixes that were listed under the
"dont compress" option in the daemon manpage (even though that wasn't clear
from the docs).
I added a --skip-compress=LIST option that allows the user to specify a list of
file suffixes to not compress. When this option is specified, it overrides the
default list except when pulling from a daemon, where it is appended to the
daemon's rules.
I also added several suffixes to the default list and made the suffix-matching
code much faster (so that the expanded list will not slow things down).
If anyone has suggestions for more suffixes that should be skipped by default,
let me know.
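
For illustration, suffix-based skipping of the kind described here might look like the sketch below. It is not rsync's implementation: parse_skip_list and skip_compression are hypothetical names, the slash-separated LIST format and the case-insensitive match are assumptions, and the set lookup is just one simple way to keep matching cheap as the list grows.

```python
import os


def parse_skip_list(spec: str) -> frozenset:
    """Turn a LIST such as "gz/jpg/mp3" into a set of lower-cased suffixes.

    Assumption: suffixes are given without the dot and separated by slashes.
    """
    return frozenset(s.lower() for s in spec.split("/") if s)


def skip_compression(path: str, suffixes: frozenset) -> bool:
    """True if the file's final suffix is on the skip list."""
    ext = os.path.splitext(path)[1]      # e.g. ".mp3", or "" if there is none
    return ext[1:].lower() in suffixes   # constant-time set lookup


if __name__ == "__main__":
    skip = parse_skip_list("gz/jpg/mp3/mov/ogg")
    for name in ("song.MP3", "archive.tar.gz", "notes.txt", "README"):
        print(name, skip_compression(name, skip))
```

Only the last suffix is examined, so archive.tar.gz is matched through gz; pattern syntax beyond plain suffixes is left out of this sketch.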