Daniel Mare
2014-Feb-05 05:00 UTC
Feature Request: don't sync if it would result in more than NUM deletions.
As a safety feature, I would like rsync to be able to refuse to sync at all when the sync, if it were to go ahead, would delete more than a certain number of files from the destination.

A similar option, --max-delete, does exist, but it does not stop rsync from doing a partial run when the --max-delete limit would be exceeded: it simply deletes up to that number of files.

In scheduled scripts, this means the damage is done a bit at a time at each run, so it is not really a safety feature. In addition, even if the problem is caught after only one run, the clone is still left in an unknown state (at least without processing thousands of delete entries in the log).

A workaround I considered was doing a dry-run first, then proceeding with the real rsync only when the dry-run doesn't exit with exit code 25, but this is extremely inefficient and completely impractical for e.g. syncing large data stores overnight (4 TB in my case).

I suspect there would be reasonable demand for and use of this feature by others.
Kevin Korb
2014-Feb-05 05:32 UTC
Feature Request: don't sync if it would result in more than NUM deletions.
You could do a --dry-run with --max-delete and check for exit code 25. Then do the real run only if the dry run did not exit with code 25.
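The two-step check suggested here could be scripted roughly as follows. This is only a sketch under assumptions: the function name `guarded_sync`, the `-a --delete` options and the injectable `run` parameter are illustrative choices, not something from this thread, and it relies on the dry run reporting exit code 25 as described above, which is worth confirming against your rsync version.

```python
import subprocess

# Exit code rsync documents for "--max-delete limit stopped deletions".
RERR_DEL_LIMIT = 25


def guarded_sync(src, dst, max_delete, run=subprocess.run):
    """Run the real sync only if a dry run stays within max_delete deletions.

    Returns the dry run's exit code if it was non-zero (nothing is changed
    in that case), otherwise the exit code of the real run.
    The rsync options used here are assumptions; adjust them to your setup.
    """
    base = ["rsync", "-a", "--delete", f"--max-delete={max_delete}"]

    # Pass 1: dry run, nothing on the destination is modified.
    dry = run(base + ["--dry-run", src, dst])
    if dry.returncode != 0:
        # 25 means the deletion limit would be hit; any other non-zero
        # code is treated as a reason not to touch the destination either.
        return dry.returncode

    # Pass 2: the real transfer, with the same limit still in place.
    return run(base + [src, dst]).returncode
```

A scheduled script could call `guarded_sync("/data/", "backup:/data/", 100)` and raise an alert when the result equals `RERR_DEL_LIMIT`; the paths and the limit of 100 are placeholders.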
Matthias Schniedermeyer
2014-Feb-05 08:27 UTC
Feature Request: don't sync if it would result in more than NUM deletions.
What you describe can only be done in a non-incremental way. So it's a two-pass thing anyway, and the question is: does the metadata of all files fit into RAM?

Whether you do a separate dry-run or run in non-incremental mode, the difference in run time depends on whether the metadata fits into RAM and on how many files ultimately have to be transferred.

If there is a non-negligible number of files to be transferred and the metadata does NOT fit into RAM, there likely isn't much of a difference. If all metadata fits into RAM, a separate dry-run doesn't hurt much.

And if all else fails, you may try to shard the transfer into pieces that fit into RAM and then do a dry-run and transfer for each piece.

--
Matthias
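The sharding idea could look something like the sketch below, which treats each top-level subdirectory of the source as one piece and does a dry-run followed by a transfer for each. The function name `sync_in_shards`, the choice of top-level directories as shards and the `-a --delete` options are assumptions made for illustration. Note two limits of this approach: the deletion limit applies per shard rather than to the sync as a whole, and loose files in the source root, as well as top-level directories that no longer exist on the source, are not handled.

```python
import os
import subprocess


def sync_in_shards(src_root, dst_root, max_delete, run=subprocess.run):
    """Dry-run, then sync, each top-level subdirectory of src_root.

    Returns a dict mapping shard name to exit code. A shard whose dry run
    exits non-zero (25 = deletion limit would be hit) is left untouched.
    """
    results = {}
    base = ["rsync", "-a", "--delete", f"--max-delete={max_delete}"]
    for name in sorted(os.listdir(src_root)):
        src = os.path.join(src_root, name)
        if not os.path.isdir(src):
            continue  # loose files in src_root are not covered by this sketch
        # Trailing slashes: sync the directory's contents into its twin.
        args = [src + "/", dst_root.rstrip("/") + "/" + name + "/"]

        dry = run(base + ["--dry-run"] + args)
        if dry.returncode != 0:
            results[name] = dry.returncode
            continue
        results[name] = run(base + args).returncode
    return results
```

Whether each piece actually fits into RAM depends on how the files are distributed, so the shard boundaries may need to be chosen by hand for a given data store.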