On Wed, Jan 15, 2003 at 11:50:27AM +0100, Harald Fielker wrote:
> Hi,
>
> I am using Rsync for making backups of a MySQL database. The MySQL files can
> be compressed about 1:10 and I want to make use of this fact.
>
> Rsync currently doesn't support saving files in a compressed state. I
> personally think this should be a feature for the filesystem (in the sense of
> "synchronised files") but currently there is no such filesystem for Linux
> available.
e2compr is not dead. See http://www.alizt.com/
> Here is my idea:
>
> We will have two new options:
>
> -X : this will specify a compress program (e.g. gzip, bzip...) - the default
> compressor is "gzip"
> -Z : this will activate storage file compression.
Why two options? Just specify the compressor and that
enables compression.
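As a rough illustration of the single-option approach, here is a minimal
Python sketch (not rsync's C option table). The option name --store-compress
is hypothetical; the point is only that naming a compressor is what switches
compression on.

```python
import argparse


def parse_args(argv):
    """Parse a hypothetical single storage-compression option."""
    parser = argparse.ArgumentParser(prog="rsync-sketch")
    # Hypothetical option: giving a compressor name enables compression,
    # so no separate on/off flag is needed.
    parser.add_argument("--store-compress", metavar="PROGRAM", default=None)
    args = parser.parse_args(argv)
    args.compress_enabled = args.store_compress is not None
    return args
```

Leaving the option out keeps the current behaviour; passing
--store-compress gzip both selects gzip and turns compression on.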
> If "-Z" is enabled, every name (files, directories, links, ...) gets an
> extension called ".rsc".
And .rsc stands for what, rsync? Even Windows has overcome
the three letter extension limit.
> If we have a regular file, there is a header section and a data section. The
> header section will store the following attributes:
>
> - magic number
> - unpacked size
> - packed size
> - compress program (e.g. gzip, bzip2, ...)
> - magic number
So you add yet another compressed file format. There's
something the world is crying out for.
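For reference, the proposed layout (magic, unpacked size, packed size,
compressor name, magic, then the compressed data) could look roughly like
the Python sketch below. The field widths, byte order, magic value and the
16-byte name field are all assumptions made for illustration; the proposal
does not specify them, and only gzip is handled here.

```python
import gzip
import struct

# Hypothetical magic value; in big-endian order its bytes read "RSC1".
MAGIC = 0x52534331
# Assumed layout: magic, unpacked size, packed size, compressor name, magic.
HEADER = struct.Struct(">IQQ16sI")


def pack_rsc(data, compressor="gzip"):
    """Return header + gzip-compressed data for one file's contents."""
    packed = gzip.compress(data)
    header = HEADER.pack(MAGIC, len(data), len(packed),
                         compressor.encode("ascii"), MAGIC)
    return header + packed


def read_unpacked_size(blob):
    """Read the original file size from the header without decompressing."""
    return HEADER.unpack_from(blob)[1]


def unpack_rsc(blob):
    """Validate the header and return the decompressed contents."""
    magic1, unpacked_size, packed_size, name, magic2 = HEADER.unpack_from(blob)
    if magic1 != MAGIC or magic2 != MAGIC:
        raise ValueError("bad magic number")
    if name.rstrip(b"\0") != b"gzip":
        raise ValueError("this sketch only handles gzip")
    payload = blob[HEADER.size:HEADER.size + packed_size]
    if len(payload) != packed_size:
        raise ValueError("truncated data section")
    data = gzip.decompress(payload)
    if len(data) != unpacked_size:
        raise ValueError("unpacked size does not match header")
    return data
```

The size check rsync does on a file would use read_unpacked_size() rather
than the on-disk size of the stored file.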
> After the header section we will have the compressed file using the program
> the user gave us with "-X"
>
> Every action in rsync will still work, with some exceptions:
>
> 1) Every file object has the extension .rsc.
> 2) Doing simple checks (size, etc.) on files: the filesize needs evaluation
> for the .rsc header.
> 3) The local file needs to be decompressed when it is accessed for reading.
> 4) The local file needs to be compressed after it was modified or created. A
> header section needs to be added.
> 5) The file stats (atime/ctime/mtime) will be applied to the .rsc file in the
> normal way.
>
> Problems/ideas:
>
> 1) On Unix this will allow us only files with names up to
> 255 - strlen(".rsc") ... but this might be a very, very rare case; we will
> disable compression for this single file.
Rsync already has issues with tempfile names. This is
shorter.
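The length rule being discussed comes down to a one-line check. A minimal
Python sketch follows; the 255-byte limit is the figure quoted above and is
assumed to apply per path component, and stored_name() is a hypothetical
helper, not an rsync function.

```python
import os

NAME_MAX = 255        # per-component limit quoted in the proposal (assumed)
EXTENSION = ".rsc"


def stored_name(name):
    """Return (name_on_disk, compressed) for a single path component."""
    if len(os.fsencode(name)) + len(EXTENSION) > NAME_MAX:
        # No room for the extension: keep the original name, uncompressed.
        return name, False
    return name + EXTENSION, True
```

With these numbers, names of up to 251 bytes get the extension and anything
longer is stored as-is.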
> 2) Rsync will need a new option for decompressing and stating the .rsc file
> tree. (single file, recursive)
>
> We should also offer options for validating .rsc files and converting a tree
> to a .rsc filetree.
>
> I am sending some compressor patches. I am very new to the rsync source, so
> here is a list of what I did:
>
> options.c:
> - added -X and -Z options (-Z is passed through to the server when using
> user@host.foo:/directory)
>
> flist.c:
> extension ".rsc" is added to every file/directory (in -Z mode)
>
> rsync.c:
> finish_transfer() now does the compression when in -Z mode before stating the
> file. That means the compressed file has the same stat as the uncompressed
> file.
>
> receiver.c:
> I added two new functions:
> - storage_decompress: this will decompress an .rsc file to a tmp file, e.g.
> for calculating sums (note: a delete function is missing!)
>
> - storage_decompress_update_stats: this will update a given stat structure
> with the decompressed filesize of the rsc file.
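On the missing delete function: one way to make the cleanup hard to forget
is to tie the temporary file's lifetime to a scope. The Python sketch below
illustrates the idea only; it is not the C code from the patch, it assumes
the stored file is plain gzip with no extra header, and it reuses the name
storage_decompress purely for readability.

```python
import contextlib
import gzip
import os
import shutil
import tempfile


@contextlib.contextmanager
def storage_decompress(path):
    """Yield the path of a temporary decompressed copy; delete it on exit."""
    fd, tmp_path = tempfile.mkstemp(prefix="rsc-")
    try:
        with os.fdopen(fd, "wb") as out, gzip.open(path, "rb") as src:
            shutil.copyfileobj(src, out)
        yield tmp_path
    finally:
        # Runs on normal exit and on errors, so the temp file never lingers.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)
```

A caller would compute its sums inside a "with storage_decompress(path) as
tmp:" block, and os.path.getsize(tmp) there gives the decompressed size
needed for the stat update.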
>
>
> Currently transferring new files and compressing works. But the receiver
> doesn't make use of the stats that storage_decompress_update_stats
> provides. I don't know if I am calling it at the right place. I also don't
> know if the sum is always calculated for a file. If this is the case we need
> to store the MD4 sum in the .rsc header.
While the idea of rsyncing with compression is mildly
attractive I can't say I care for the new compression
format. It would be better just to use the standard gzip or
other format. If you are going to create a new file type
you could at least discuss storing the blocksums in it so
that the receiver wouldn't have to generate them.
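To make the blocksum suggestion concrete, here is a minimal Python sketch of
computing one weak and one strong checksum per block so they could be stored
alongside the data. zlib.adler32 and MD5 are stand-ins chosen because they
are in the Python standard library; they are not rsync's own rolling
checksum or MD4, and the block size is an arbitrary default.

```python
import hashlib
import zlib


def block_sums(data, block_size=700):
    """Return a list of (weak, strong) checksum pairs, one per block."""
    sums = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        weak = zlib.adler32(block)            # stand-in for the rolling sum
        strong = hashlib.md5(block).digest()  # stand-in for MD4
        sums.append((weak, strong))
    return sums
```

Stored sums are only reusable if both ends agree on the block size, so that
value would have to be recorded with them.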
Finally, I didn't even look at your patch because it was not
text/plain. Unless absolutely necessary, patches should be
either inline or text/plain attachments.
--
________________________________________________________________
J.W. Schultz Pegasystems Technologies
email address: jw@pegasys.ws
Remember Cernan and Schmitt