Hello,

I am testing btrfs on one of our backup servers
(many millions of files, 1.5TB size, running on (non-btrfs-provided-)
raid5).

I am using subvolumes/snapshots followed by rsync.

It works very well, but I would like to ask a question... say I would need
to copy/move the files to a different server/disk.

Normally I would do it with rsync, but I guess it will not preserve the
subvolumes, and it will also not detect that they are the same files (I guess
they are not just normal hardlinks). So I would end up with duplicated
files.

What is the correct way to do this?

Thank you and best regards

Lubos Kolouch

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On 1 July 2010 11:28, Lubos Kolouch <lubos.kolouch@gmail.com> wrote:
> Hello,
>
> I am testing btrfs on one of our backup servers
> (many millions of files, 1.5TB size, running on (non-btrfs-provided-)
> raid5).
>
> I am using subvolumes/snapshots followed by rsync.
>
> It works very well, but I would like to ask a question... say I would need
> to copy/move the files to a different server/disk.
>
> Normally I would do it with rsync, but I guess it will not preserve the
> subvolumes, and it will also not detect that they are the same files (I guess
> they are not just normal hardlinks). So I would end up with duplicated
> files.
>
> What is the correct way to do this?

The only way to do this preserving duplication is to use hardlinks
between duplicated files (which reference counts the inode), and use
'rsync -H'.

Dan
--
Daniel J Blueman

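A minimal sketch of the 'rsync -H' suggestion, assuming rsync is installed on both ends. The paths and the helper names (run, copy_with_hardlinks) are hypothetical and only for illustration. Note that -H only recreates hard links; it knows nothing about btrfs snapshots, so files shared between snapshots would still be transferred as separate copies.

```shell
#!/bin/sh
# Sketch only: the paths below are hypothetical, not taken from the thread.
# With DRY_RUN=1 (the default) the command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

copy_with_hardlinks() {
    src=$1
    dst=$2
    # -a  archive mode (permissions, times, symlinks, ...)
    # -H  recreate hard links among the transferred files on the destination
    # Trailing slashes: copy the contents of src into dst.
    run rsync -aH "$src/" "$dst/"
}

copy_with_hardlinks /mnt/old/backups /mnt/new/backups
# prints: + rsync -aH /mnt/old/backups/ /mnt/new/backups/
```

Set DRY_RUN=0 to actually run the transfer. Hard links are only detected among files that are part of the same rsync invocation, so the whole hard-linked tree has to be copied in one run.
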
Daniel J Blueman, Thu, 01 Jul 2010 12:26:10 +0100:
>> What is the correct way to do this?
>
> The only way to do this preserving duplication is to use hardlinks
> between duplicated files (which reference counts the inode), and use
> 'rsync -H'.
>
> Dan

But when the files are in different snapshots, does rsync see them as
hardlinked?

A scenario - I have a raid5 of, say, 1TB HDDs. It contains many snapshots.
Then, a few years later, a new machine is bought and there are, say, 5TB
discs.

So I need to transfer the btrfs volume to the new machine.

But how to do it so that it looks the *same*, i.e. the same snapshots?
I could of course write a custom script to create the subvolume, rsync
the files, create a snapshot, rsync files, etc.,

but it would be nice if the btrfs toolset supported this by default...

Lubos

On 07/01/2010 05:33 AM, Lubos Kolouch wrote:
> Daniel J Blueman, Thu, 01 Jul 2010 12:26:10 +0100:
>>> What is the correct way to do this?
>>
>> The only way to do this preserving duplication is to use hardlinks
>> between duplicated files (which reference counts the inode), and use
>> 'rsync -H'.
>>
>> Dan

Hello,

With backed up files consisting of hard links, I usually use dd to copy
the file systems at the block level

# dd if=/dev/sda of=/dev/sdb bs=20M

and then expand the file system. This is because I found that tools like
rsync, while usually fast, are extremely slow when dealing with millions
of hard linked files.

This could also be used for btrfs to keep its snapshots.

> A scenario - I have a raid5 of, say, 1TB HDDs. It contains many snapshots.
> Then, a few years later, a new machine is bought and there are, say, 5TB
> discs.
> ...
> Lubos

For me, I had to copy over BackupPC hardlinked files from a full disk to
a smaller disk, both using ext4, and I could not use dd. What normally
should have taken an hour instead took almost a week. (Yes, I wanted to
use btrfs, but it had a hard link limit of 255 - don't know if it still
does.)

It would be nice to have a btrfs command that could rapidly copy over
the file system, snapshots, and all other file system info.

But what benefit would having a native btrfs 'copy/rsync' command have
over the dd/resize option?

Pros
- Files will be immediately checksummed on the new disks, but this may not
  be as important since a checksum/verify command will be implemented.
- Great 'feature' for copying files to new drives, and keeping
  snapshots. Could even be used to export snapshots.
- I believe compressed files will have to be uncompressed and
  recompressed, depending on when the file is checksummed. (I may be wrong
  on this one.) This will actually be a con for slow and/or high load
  machines.
- One command instead of many (dd -> resize -> verify).

Cons
- File system would still have to be unmounted, or at least read-only,
  as I doubt the command will have rsync's update or delete abilities.
  But, maybe it could.

Questionable
- May be faster than dd/resize, or it may be just as slow as rsync is
  with hard links. And I am talking about dozens to thousands of
  snapshots, and millions to billions of files.

Matt

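The dd -> resize sequence described above could be sketched roughly as follows. This is an illustration under assumptions, not a tested migration procedure: the device names and mount point are hypothetical, the helper names (run, clone_and_grow_ext4, clone_and_grow_btrfs) are made up for this example, and the file system on the old device is assumed to be unmounted while dd runs.

```shell
#!/bin/sh
# Sketch only. The device arguments are hypothetical; dd overwrites the target.
# Commands are only printed unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Block-level copy of an unmounted ext4 file system, then grow it to fill
# the new device.
clone_and_grow_ext4() {
    old=$1
    new=$2
    run dd if="$old" of="$new" bs=20M || return 1
    # resize2fs expects a freshly checked file system. e2fsck also returns
    # non-zero when it corrected errors, in which case we stop here.
    run e2fsck -f "$new" || return 1
    # With no size argument, resize2fs grows to the size of the device.
    run resize2fs "$new"
}

# Same idea for btrfs: the block copy carries all subvolumes and snapshots.
# btrfs is resized while mounted, so a mount point is needed.
clone_and_grow_btrfs() {
    old=$1
    new=$2
    mnt=$3
    run dd if="$old" of="$new" bs=20M || return 1
    # The copy has the same file system UUID as the original, so the old
    # device should be detached before the new one is mounted.
    run mount "$new" "$mnt" || return 1
    run btrfs filesystem resize max "$mnt"
}

clone_and_grow_ext4 /dev/sda /dev/sdb
```

In dry-run mode the last line only prints the three commands. The target device must be at least as large as the source for this approach, which is why it did not help in the full-disk-to-smaller-disk case mentioned above.
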
On Thu, Jul 01, 2010 at 11:33:59AM +0000, Lubos Kolouch wrote:
> Daniel J Blueman, Thu, 01 Jul 2010 12:26:10 +0100:
> >> What is the correct way to do this?
> >
> > The only way to do this preserving duplication is to use hardlinks
> > between duplicated files (which reference counts the inode), and use
> > 'rsync -H'.
> >
> > Dan
>
> But when the files are in different snapshots, does rsync see them as
> hardlinked?
>
> A scenario - I have a raid5 of, say, 1TB HDDs. It contains many snapshots.
> Then, a few years later, a new machine is bought and there are, say, 5TB
> discs.
>
> So I need to transfer the btrfs volume to the new machine.
>
> But how to do it so that it looks the *same*, i.e. the same snapshots?
> I could of course write a custom script to create the subvolume, rsync
> the files, create a snapshot, rsync files, etc.,
>
> but it would be nice if the btrfs toolset supported this by default...

This is definitely something I'm looking to add. The btrfs-progs git
tree has some code that allows userland to walk the btrees and detect
the duplicate files. But this is just a building block needed for the
full backup program.

Instead of hard links, it is possible to use reflinks with cp, which
uses the cloning ioctl.

-chris

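A small sketch of the reflink idea, assuming GNU coreutils cp (which provides the --reflink option). The helper name clone_file and the temporary files are hypothetical. With --reflink=auto the copy shares data blocks on file systems that support cloning, such as btrfs, and silently becomes an ordinary copy elsewhere, so the example also runs on other file systems.

```shell
#!/bin/sh
# Sketch only: clone_file is a hypothetical helper name.
# --reflink=auto requests a copy-on-write clone where the file system
# supports it and falls back to a normal copy otherwise.
# --reflink=always would instead fail where cloning is not supported.
clone_file() {
    cp --reflink=auto -- "$1" "$2"
}

work=$(mktemp -d)
printf 'backup data\n' > "$work/original"
clone_file "$work/original" "$work/clone"

# The clone is an independent file: appending to it leaves the original
# untouched, unlike a hard link, where both names refer to one inode.
printf 'changed\n' >> "$work/clone"
cat "$work/original"

rm -rf "$work"
```

Because a reflinked file has its own inode, it does not count against a per-file hard link limit, and later changes to one copy do not show up in the other.
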
* [Matt Brown]
> With backed up files consisting of hard links, I usually use dd to copy
> the file systems at the block level
>
> # dd if=/dev/sda of=/dev/sdb bs=20M
>
> and then expand the file system. This is because I found that tools like
> rsync, while usually fast, are extremely slow when dealing with millions
> of hard linked files.
>
> This could also be used for btrfs to keep its snapshots.

If you can (temporarily) attach the old and new drives to the same
computer, putting the ext4 BackupPC store on LVM and moving the LV
around might be more convenient, or at least feel more "high level".

For btrfs with lots of snapshots, I believe "btrfs device add" of the
new device followed by "btrfs device remove" of the old one would be the
most convenient.

One advantage of using LVM and btrfs multi device support in this way is
that the actual downtime is minimal -- you can keep the filesystems
online. Even on cheap hardware, the only downtime should be to
attach/remove disks.

Øystein
--
If it ain't broke, don't break it.

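The two online migrations suggested above might look roughly like this. It is a sketch under assumptions: device names, mount point and volume group name are hypothetical, as are the helper names (run, migrate_btrfs, migrate_lvm). The btrfs subcommand is written as "device delete" here; newer btrfs-progs releases also accept "device remove" as used in the message.

```shell
#!/bin/sh
# Sketch only. All device, mount point and volume group names passed to
# these helpers are hypothetical. Commands are printed unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# btrfs: add the new device to the mounted file system, then delete the
# old one. Deleting a device relocates its data to the remaining devices,
# so subvolumes and snapshots stay as they are.
migrate_btrfs() {
    old=$1
    new=$2
    mnt=$3
    run btrfs device add "$new" "$mnt" || return 1
    run btrfs device delete "$old" "$mnt"
}

# LVM: make the new disk a physical volume, add it to the volume group,
# move all extents off the old disk, then drop the old disk from the group.
migrate_lvm() {
    old=$1
    new=$2
    vg=$3
    run pvcreate "$new" || return 1
    run vgextend "$vg" "$new" || return 1
    run pvmove "$old" "$new" || return 1
    run vgreduce "$vg" "$old"
}

migrate_btrfs /dev/sdb /dev/sdc /mnt/backup
```

Both sequences work on a mounted file system, which is where the minimal downtime comes from. In the LVM case the logical volume and the ext4 file system on it would still need to be grown afterwards to use the extra space.
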
Oystein Viggen, Fri, 02 Jul 2010 08:15:03 +0200:
> For btrfs with lots of snapshots, I believe "btrfs device add" of the
> new device followed by "btrfs device remove" of the old one would be the
> most convenient.
>
> Øystein

This solution is very elegant and cool - if you can put the discs into one
computer.

It does not help much with copying the files over the network while
preserving the snapshots... or can you add a network-attached device
(sshfs) this way?

Lubos

On Saturday 03 July 2010 09:33:19 Lubos Kolouch wrote:
> Oystein Viggen, Fri, 02 Jul 2010 08:15:03 +0200:
> > For btrfs with lots of snapshots, I believe "btrfs device add" of the
> > new device followed by "btrfs device remove" of the old one would be the
> > most convenient.
> >
> > Øystein
>
> This solution is very elegant and cool - if you can put the discs into one
> computer.
>
> It does not help much with copying the files over the network while
> preserving the snapshots... or can you add a network-attached device
> (sshfs) this way?

You could also go for the totally cool option (albeit a bit crazy) and use
network block devices and have no downtime... The overall process will take
more time though.

--
Hubert Kario
QBS - Quality Business Software
ul. Ksawerów 30/85
02-656 Warszawa
POLAND
tel. +48 (22) 646-61-51, 646-74-24
fax +48 (22) 646-61-50