thr3ads.net - CentOS - [CentOS] Need help to fix bug in rsync [Mar 2020]

Simon Matter

2020-Mar-25 18:15 UTC

[CentOS] Need help to fix bug in rsync

> On Wed, 2020-03-25 at 14:39 +0000, Leroy Tennison wrote:
>> Since you state that using -z is almost always a bad idea, could you
>> provide the rationale for that?  I must be missing something.
>>
> I think the "rationale" is that at some point the
> compression/decompression takes longer than the time reduction from
> sending a compressed file.  It depends on the relative speeds of the
> machines and the network.
>
> You have most to gain from compressing large files, but if they are
> already compressed, then you have nothing to gain from just doing small
> files.
>
> It obviously depends on your network speed and if you have a metered
> connection, but does anyone really have such an ancient network
> connection still these days - I mean if you have fast enough machines
> at both ends to do rapid compression/decompression, it seems unlikely
> that you will have a damp piece of string connecting them.

I really don't understand the discussion here. What is wrong with using -z
with rsync? We're using rsync with -z for backups and just don't want to
waste bandwidth for nothing. We have better use for our bandwidth and it
makes quite a difference when backing up terabytes of data.

The only reason why I asked for help is because we don't want to double
compress data which is already compressed. This is what currently is
broken in rsync without manually specifying a skip-compress list. Fixing
it would help all those who don't know it's broken now.
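
For readers who have not used the manual workaround mentioned here, the following is a minimal sketch of passing an explicit skip-compress list. It only builds the argument list and does not run rsync. `--skip-compress` is an existing rsync option that takes slash-separated suffixes; the helper name, the suffix list, and the paths are illustrative assumptions, not anything taken from this thread.

```python
# Hypothetical helper: build an rsync command that compresses in transit (-z)
# but skips files whose suffix marks them as already compressed.

def rsync_with_skip_list(src, dest, suffixes):
    # rsync expects the suffixes as one slash-separated list, e.g. gz/zip/jpg
    skip = "/".join(suffixes)
    return ["rsync", "-az", f"--skip-compress={skip}", src, dest]


# Example suffix list (an assumption; adjust it to the data being backed up).
ALREADY_COMPRESSED = ["gz", "bz2", "xz", "zip", "7z", "jpg", "mp4"]

# Placeholder source and destination, chosen only for illustration.
cmd = rsync_with_skip_list("/srv/data/", "backup:/srv/backup/", ALREADY_COMPRESSED)
print(" ".join(cmd))
```

The printed line can be pasted into a shell once the paths and suffixes match the real setup.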

Thanks,
Simon

Pete Biggs

2020-Mar-25 18:32 UTC

On Wed, 2020-03-25 at 19:15 +0100, Simon Matter via CentOS wrote:
> I really don't understand the discussion here. What is wrong with using -z
> with rsync? We're using rsync with -z for backups and just don't want to
> waste bandwidth for nothing. We have better use for our bandwidth and it
> makes quite a difference when backing up terabytes of data.

I don't really care if you use -z, but you asked for the rationale, and
I gave you it. I'm not telling you what you should do.

I'll try and make it simpler - if rsync takes 1 second to compress the
file, then 1 second to decompress the file, and the whole transfer of
the file takes 11 seconds uncompressed vs 10 seconds compressed, then
dealing with the file takes 12 seconds overall compressed, vs 11 seconds
uncompressed. It's not worth it.

But as I said it depends on your network and your machine speeds.  It's
up to you to decide what is best in your own situation.
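
The arithmetic above can be written out as a small sketch. It assumes the same simplified serial model as the example (compression, transfer, and decompression happen one after another); the function names, the file size, the link speeds, and the compression ratio are made up for illustration.

```python
def total_seconds(transfer_s, compress_s=0.0, decompress_s=0.0):
    """Serial model: the three steps simply add up."""
    return compress_s + transfer_s + decompress_s


# The numbers from the message: 11 s plain vs 1 s + 10 s + 1 s compressed.
plain = total_seconds(transfer_s=11)
with_z = total_seconds(transfer_s=10, compress_s=1, decompress_s=1)
print(f"plain: {plain} s, compressed: {with_z} s")


def compression_pays_off(size_mb, link_mb_per_s, ratio, cpu_s):
    """True if compressing makes the whole job faster under the serial model.

    ratio is original size divided by compressed size; cpu_s is the combined
    compression and decompression time. All inputs are hypothetical.
    """
    plain_s = size_mb / link_mb_per_s
    compressed_s = size_mb / (ratio * link_mb_per_s) + cpu_s
    return compressed_s < plain_s


# Same file and CPU cost, two different (assumed) link speeds.
print(compression_pays_off(110, 10, 1.1, 2))  # faster link
print(compression_pays_off(110, 1, 1.1, 2))   # slower link
```

With these assumed inputs, the fixed CPU cost outweighs the saving on the faster link but not on the slower one, which is the dependence on network and machine speed described above.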

P.

Leroy Tennison

2020-Mar-25 18:35 UTC

That's why I asked: I wanted to know whether there was something inherently
bad about "-z".  I had a situation where PostgreSQL was replicating 16M
files every few minutes ("log shipping") on approximately 10 systems. It
got behind, which resulted in almost continuous file transfer (of mostly
null 16M files) and saturated the common link.  Specifying compression for
the file transfer cut transmission time by 5-10x, resolving the problem.
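
As a rough illustration of why mostly null files benefit so much, here is a minimal sketch using Python's `zlib` module. It assumes a zlib-style compressor; the all-zero buffer stands in for a mostly null 16M file, and the random bytes stand in for data that is already compressed. Neither is real log-shipping data.

```python
import os
import zlib

SEGMENT_BYTES = 16 * 1024 * 1024  # 16 MiB, matching the file size mentioned above

# All-zero buffer as a stand-in for a mostly null file.
segment = bytes(SEGMENT_BYTES)
compressed = zlib.compress(segment, 6)
ratio = len(segment) / len(compressed)
print(f"null data:   {len(segment)} -> {len(compressed)} bytes (about {ratio:.0f}x smaller)")

# Random bytes as a stand-in for already-compressed data: nothing left to squeeze.
random_blob = os.urandom(1024 * 1024)
random_compressed = zlib.compress(random_blob, 6)
print(f"random data: {len(random_blob)} -> {len(random_compressed)} bytes")
```

A real mostly null file will compress less dramatically than this all-zero stand-in, and the second case shows why compressing already-compressed data again gains nothing.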

Leroy Tennison
Network Information/Cyber Security Specialist
E: leroy at datavoiceint.com

_______________________________________________
CentOS mailing list
CentOS at centos.org
https://lists.centos.org/mailman/listinfo/centos

Leroy Tennison

2020-Mar-25 18:40 UTC

I appreciate the reply - it keeps me from wondering "is there something I
should be concerned about?".  We use a co-location facility where we pay
for bandwidth utilization so it's still an issue.

Leon Fauster

2020-Mar-25 19:17 UTC

On 25.03.20 at 19:15, Simon Matter via CentOS wrote:
> The only reason why I asked for help is because we don't want to double
> compress data which is already compressed. This is what currently is
> broken in rsync without manually specifying a skip-compress list. Fixing
> it would help all those who don't know it's broken now.
>
Until this is fixed, as a workaround I would do a two-pass transfer using
filters in a ".rsync-filter" file: first rsync -azvF for everything with a
high compression ratio, then rsync -av for everything, including the
compressed data.
The ".rsync-filter" file holds the exclude statements for the compressed
formats. This only makes sense if the savings from compression outweigh
the metadata transfer of the second run ...
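
The two-pass idea can be sketched as follows. This is only an illustration that builds the filter file contents and the two command lines without running rsync; the helper names, the suffix list, and the paths are assumptions. `-F` is rsync's shorthand for reading per-directory `.rsync-filter` files, and `- pattern` is the exclude rule syntax in such a file.

```python
def filter_rules(suffixes):
    """Contents for a .rsync-filter file: one exclude rule per suffix."""
    return "".join(f"- *.{suffix}\n" for suffix in suffixes)


def two_pass_commands(src, dest):
    """Return the two rsync invocations described above as argument lists."""
    # Pass 1: compressed transfer; -F applies the excludes in .rsync-filter,
    # so already-compressed formats are left out of this pass.
    first = ["rsync", "-azvF", src, dest]
    # Pass 2: no -z and no -F, so everything still missing (the compressed
    # formats) is sent without spending CPU on recompression.
    second = ["rsync", "-av", src, dest]
    return first, second


# Hypothetical suffixes and paths, chosen only for illustration.
rules = filter_rules(["gz", "zip", "jpg"])
print(rules, end="")

for command in two_pass_commands("/srv/data/", "backup:/srv/backup/"):
    print(" ".join(command))
```

The rules text would be saved as `.rsync-filter` in the top-level source directory before the first pass is run.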

--
Leon
