samba-bugs at samba.org
2015-Jan-22 14:38 UTC
[Bug 11067] New: add --min-depth and --max-depth options
https://bugzilla.samba.org/show_bug.cgi?id=11067

            Bug ID: 11067
           Summary: add --min-depth and --max-depth options
           Product: rsync
           Version: 3.1.1
          Hardware: All
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P5
         Component: core
          Assignee: wayned at samba.org
          Reporter: chip at innovates.com
        QA Contact: rsync-qa at samba.org

As file systems are getting bigger and bigger, into the multiple petabyte scale,
rsync is not scaling well, but is still the tool of choice when migrating
filesystems from one storage vendor to another.

Adding --min-depth and --max-depth options to control what directory depth
rsync will operate on would allow better targeted rsync processes.

--
You are receiving this mail because:
You are the QA Contact for the bug.
Maybe a bit off topic. (I don't deal with any data even remotely that large.)

How would you use these new options - just as a way to break large tasks
into smaller "batches"?
If rsync "stops in the middle", then the target would be in a sort of
limbo where it might not be fully usable.

Joe

On 01/22/2015 09:38 AM, samba-bugs at samba.org wrote:
> https://bugzilla.samba.org/show_bug.cgi?id=11067
>
>             Bug ID: 11067
>            Summary: add --min-depth and --max-depth options
>            Product: rsync
>            Version: 3.1.1
>           Hardware: All
>                 OS: All
>             Status: NEW
>           Severity: enhancement
>           Priority: P5
>          Component: core
>           Assignee: wayned at samba.org
>           Reporter: chip at innovates.com
>         QA Contact: rsync-qa at samba.org
>
> As file systems are getting bigger and bigger, into the multiple petabyte scale,
> rsync is not scaling well, but is still the tool of choice when migrating
> filesystems from one storage vendor to another.
>
> Adding --min-depth and --max-depth options to control what directory depth
> rsync will operate on would allow better targeted rsync processes.
>
Matthias Schniedermeyer
2015-Jan-22 21:01 UTC
[Bug 11067] New: add --min-depth and --max-depth options
On 22.01.2015 14:32, Joe wrote:
> Maybe a bit off topic. (I don't deal with any data even remotely that
> large.)
>
> How would you use these new options - just as a way to break large tasks
> into smaller "batches"?
> If rsync "stops in the middle", then the target would be in a sort of
> limbo where it might not be fully usable.

If I had to guess, the description says to me: "big fat storage system
underneath", which means: bandwidth limited by a single rsync process.

So: first you do a "--max-depth=2" (or so) run to set a basis. Then you
can do several "--min-depth=2" runs (for different parts) in parallel, to
get the blood of the storage system pumping. That also uses more than one
CPU, which a single rsync would be limited to.

If you have a "big fat" server with a truckload of cores and an I/O system
that can do several GB/s, the roughly 500 MB/s I get on my personal
computer for a single rsync is a limiting factor.

Or just think about the upcoming PCIe NVMe SSDs that can do several GB/s
(and plug several of them into one computer). You need several rsync
processes to spread the I/O load over enough CPUs just to duplicate a
single such SSD to another, if you want to do that in the shortest time
possible.

> On 01/22/2015 09:38 AM, samba-bugs at samba.org wrote:
> > https://bugzilla.samba.org/show_bug.cgi?id=11067
> >
> >             Bug ID: 11067
> >            Summary: add --min-depth and --max-depth options
> >            Product: rsync
> >            Version: 3.1.1
> >           Hardware: All
> >                 OS: All
> >             Status: NEW
> >           Severity: enhancement
> >           Priority: P5
> >          Component: core
> >           Assignee: wayned at samba.org
> >           Reporter: chip at innovates.com
> >         QA Contact: rsync-qa at samba.org
> >
> > As file systems are getting bigger and bigger, into the multiple petabyte scale,
> > rsync is not scaling well, but is still the tool of choice when migrating
> > filesystems from one storage vendor to another.
> >
> > Adding --min-depth and --max-depth options to control what directory depth
> > rsync will operate on would allow better targeted rsync processes.
> >

--
Matthias
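
For illustration only: --min-depth and --max-depth are the options proposed in this bug and do not exist in rsync 3.1.1, so the sketch below merely approximates the two-phase idea described above using options rsync already has (-a and an anchored --exclude pattern). The helper names (dirs_at_depth, shallow_command, subtree_commands, run_parallel) and the worker count are hypothetical, and the sketch assumes local paths and a depth of at least 1.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def dirs_at_depth(root, depth):
    """Return sorted relative paths of directories exactly `depth` levels below root."""
    root = Path(root)
    found = []
    for current, subdirs, _files in os.walk(root):
        rel = Path(current).relative_to(root)
        if len(rel.parts) == depth:
            found.append(rel.as_posix())
            subdirs[:] = []  # stop descending below the requested depth
    return sorted(found)


def shallow_command(src, dest, depth):
    """Phase 1: copy only entries whose path has at most `depth` components.

    An anchored exclude pattern with depth + 1 components ("/*/*/*" for
    depth 2) keeps rsync from transferring or descending into anything deeper.
    """
    pattern = "/" + "/".join(["*"] * (depth + 1))
    return ["rsync", "-a", f"--exclude={pattern}", f"{src}/", f"{dest}/"]


def subtree_commands(src, dest, subdirs):
    """Phase 2: one independent rsync per subtree found at the split depth."""
    return [["rsync", "-a", f"{src}/{d}/", f"{dest}/{d}/"] for d in subdirs]


def run_parallel(commands, workers=4):
    """Run the given commands concurrently; raises if any rsync exits non-zero."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda cmd: subprocess.run(cmd, check=True), commands))
```

A possible sequence would be to run shallow_command(src, dest, 2) once with subprocess.run, then pass subtree_commands(src, dest, dirs_at_depth(src, 2)) to run_parallel. The shallow pass has to finish first so that the destination directories for the per-subtree runs already exist. Joe's concern still applies: until every subtree run has completed, the target is only partially populated.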