I want to merge 3 slightly different directories of mostly images. Not just mostly: the vast majority are image files. Each directory holds about 285 GB of files.

At first I thought I would just run a fairly plain rsync from each directory in turn, starting with the biggest (which is not much bigger than the others, maybe a few MB). Like:

    rsync -vvrptgoD --stats /biggest/ /emptydir/
    rsync -vvrptgoD --stats /next-biggest/ /same-dir/
    rsync -vvrptgoD --stats /smallest/ /same-dir

But after some thought I'm guessing that might be a wrong-headed way to go. All three directories have mostly the same stuff in the same places, but a close inspection of 285 GB is pretty much a non-starter. There will be thousands of files with matching names that may be newer or older, bigger or smaller, and so on. And maybe some of the same stuff sits in slightly different places.

How can I make rsync do the work for me, so I don't end up losing files?
I suspect you want a duplicate finder more than a file transfer tool. E.g.: https://stromberg.dnsalias.org/~strombrg/equivalence-classes.html

On Tue, Jun 7, 2022 at 5:36 PM hput via rsync <rsync at lists.samba.org> wrote:
> I want to merge 3 slightly different directories of mostly images.
> [...]
> --
> Please use reply-all for most replies to avoid omitting the mailing list.
> To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
> Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
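To make the duplicate-finder idea concrete, here is a minimal sketch in plain shell. It is not the linked equivalence-classes tool and makes no claim about how that tool works; `find_dupes` is a hypothetical helper name, and the sketch assumes `sha256sum` from GNU coreutils is installed and that no file name contains a newline. It hashes every file and prints groups of files with identical content, wherever they sit in the trees.

```shell
#!/bin/sh
# Sketch only: find_dupes is a hypothetical helper, not part of rsync or the linked tool.
# Prints groups of files whose contents hash identically, groups separated by a blank line.
# Assumes sha256sum (GNU coreutils) is available and file names contain no newlines.
find_dupes() {
    find "$@" -type f -exec sha256sum {} + |
        sort |
        awk '
            # Same hash as the previous line: this file duplicates an earlier one.
            $1 == prev { if (!shown) print held; print; shown = 1; next }
            # New hash: close the previous group (if any) and remember this line.
            { if (shown) print ""; prev = $1; held = $0; shown = 0 }
        '
}

# Example (paths taken from the original post):
# find_dupes /biggest /next-biggest /smallest > duplicates.txt
```

Hashing 3 x 285 GB means reading every byte once, so expect a long run. The output only reports duplicates; deciding which copy to keep is still up to you.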
It would help if you gave us an example of what you'd *want* to have happen in different situations, but what about the -b (--backup) option? rsync will leave identical files alone, and where a file differs it keeps both versions by renaming the existing destination copy (with a ~ suffix by default) before writing the new one.

On Wed, Jun 08, 2022 at 12:24:16AM +0000, hput via rsync wrote:
> I want to merge 3 slightly different directories of mostly images.
> [...]
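One way the -b suggestion might look for a three-way merge is sketched below. The helper name `merge_into` and the suffix text are made up for illustration; `--backup`, `--suffix` and `--checksum` are standard rsync options. A distinct suffix per source is used here on the assumption that you do not want a backup from the second pass overwritten by one from the third, and `--checksum` is added on the assumption that "identical" should mean identical content rather than matching size and timestamp.

```shell
#!/bin/sh
# Sketch only: merge_into is a hypothetical wrapper, not an rsync feature.
# Usage: merge_into SRC_DIR DEST_DIR TAG
# Copies SRC_DIR into DEST_DIR. A destination file that differs from the
# incoming one is first renamed to NAME.replaced-by-TAG, so neither version is lost.
merge_into() {
    src=$1
    dest=$2
    tag=$3
    # --checksum compares file contents, so files that match are not re-copied
    # (and get no backup) even if their timestamps differ. It reads every file.
    rsync -rptgoD --checksum --backup --suffix=".replaced-by-$tag" --stats \
        "$src"/ "$dest"/
}

# Example (paths taken from the original post):
# merge_into /biggest      /emptydir biggest
# merge_into /next-biggest /emptydir next-biggest
# merge_into /smallest     /emptydir smallest
```

Adding `-n` (`--dry-run`) and `-i` (`--itemize-changes`) to the rsync line first would show what each pass intends to do without touching anything. Note that this still only matches files by path, so the same image stored in a different place in each tree ends up in the merge more than once.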