I am trying to back up a large ZFS file system to two separate, identical hard drives. I have therefore started two commands to back up "myfs", and when they have finished, I will back up "nextfs":

zfs send mypool/myfs@now | zfs receive backupzpool1/now & zfs send mypool/myfs@now | zfs receive backupzpool2/now ; zfs send mypool/nextfs@now | zfs receive backupzpool3/now

in parallel. The logic is that the same file data is cached and therefore easy to send to each backup drive.

Should I instead have done one "zfs send..." and waited for it to complete, and then started the next?

It seems that "zfs send..." takes quite some time? 300GB has taken 10 hours so far, and I have 3TB in total to back up. This means it will take 100 hours. Is this normal? If I had 30TB to back up, it would take 1000 hours, which is more than a month. Can I speed this up?

Is rsync faster? As I understand it, "zfs send..." gives me an exact replica, whereas rsync doesn't necessarily do that; maybe the ACLs are not replicated, etc. Is this correct about rsync vs "zfs send"?

--
This message posted from opensolaris.org

Orvar Korvar wrote:
> It seems that "zfs send..." takes quite some time? 300GB takes 10 hours,
> this far. And I have in total 3TB to backup. This means it will take 100
> hours. Is this normal? If I had 30TB to back up, it would take 1000 hours,
> which is more than a month. Can I speed this up?

That looks very slow. How are the pools configured? Last time I migrated data between pools on the same box (an x4500), I got 80-90GB/hour.

> Is rsync faster? As I have understood it, "zfs send.." gives me an exact
> replica, whereas rsync doesnt necessary do that, maybe the ACL are not
> replicated, etc. Is this correct about rsync vs "zfs send"?

I wouldn't bother with rsync in this situation.

--
Ian.

On Sun, Oct 25, 2009 at 01:45:05AM -0700, Orvar Korvar wrote:
> I am trying to backup a large zfs file system to two different
> identical hard drives. I have therefore started two commands to backup
> "myfs" and when they have finished, I will backup "nextfs"
>
> zfs send mypool/myfs@now | zfs receive backupzpool1/now & zfs send
> mypool/myfs@now | zfs receive backupzpool2/now ; zfs send
> mypool/nextfs@now | zfs receive backupzpool3/now
>
> in parallell. The logic is that the same file data is cached and
> therefore easy to send to each backup drive.
>
> Should I instead have done one "zfs send..." and waited for it to
> complete, and then started the next?
>
> It seems that "zfs send..." takes quite some time? 300GB takes 10
> hours, this far. And I have in total 3TB to backup. This means it will
> take 100 hours. Is this normal? If I had 30TB to back up, it would
> take 1000 hours, which is more than a month. Can I speed this up?

It's not immediately obvious what the cause is. Maybe the server running zfs send has slow MB/s performance reading from disk. Maybe the network. Or maybe the remote system.

This might help:
http://tinyurl.com/yl653am

--
albert chin (china at thewrittenword.com)

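One way to narrow down which of those three candidates is the bottleneck is to time the sending side on its own, with nothing receiving the stream. The following is only a rough sketch: `stream_rate` is a hypothetical helper name, it assumes a `date` that supports `+%s`, and it only measures how fast the producer command can emit data.

```shell
# stream_rate CMD
# Run CMD, count the bytes it writes to stdout (discarding them),
# and print "<bytes> <seconds> <MB/s>" using 1 MB = 1,000,000 bytes.
# Hypothetical helper for illustration; assumes `date +%s` is available.
stream_rate() {
    start=$(date +%s)
    bytes=$(sh -c "$1" | wc -c | tr -d ' ')
    end=$(date +%s)
    secs=$((end - start))
    # Avoid division by zero for runs shorter than one second.
    [ "$secs" -gt 0 ] || secs=1
    LC_ALL=C awk -v b="$bytes" -v s="$secs" \
        'BEGIN { printf "%s %d %.1f\n", b, s, b / 1000000 / s }'
}

# Intended use (not executed here):
#   stream_rate 'zfs send mypool/myfs@now'
```

If the rate reported for the send alone is already low, the source pool is the likely limit; if it is much higher than the end-to-end rate, the receiving pool or the path in between deserves a closer look.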
On Oct 25, 2009, at 1:45 AM, Orvar Korvar wrote:
> I am trying to backup a large zfs file system to two different
> identical hard drives. I have therefore started two commands to
> backup "myfs" and when they have finished, I will backup "nextfs"
>
> zfs send mypool/myfs@now | zfs receive backupzpool1/now & zfs send
> mypool/myfs@now | zfs receive backupzpool2/now ; zfs send
> mypool/nextfs@now | zfs receive backupzpool3/now
>
> in parallell. The logic is that the same file data is cached and
> therefore easy to send to each backup drive.
>
> Should I instead have done one "zfs send..." and waited for it to
> complete, and then started the next?

Parallel works, well, in parallel. Unless the changes are in the ARC, you will be spending a lot of time waiting on disk. So having multiple sends in parallel, in general, gains parallelism. If you only have a single HDD, you might not notice much improvement, though.

> It seems that "zfs send..." takes quite some time? 300GB takes 10
> hours, this far. And I have in total 3TB to backup. This means it
> will take 100 hours. Is this normal? If I had 30TB to back up, it
> would take 1000 hours, which is more than a month. Can I speed this
> up?

CR 6418042, integrated in b102 and Solaris 10 10/09, improves send performance.

> Is rsync faster? As I have understood it, "zfs send.." gives me an
> exact replica, whereas rsync doesnt necessary do that, maybe the ACL
> are not replicated, etc. Is this correct about rsync vs "zfs send"?

In general, rsync will be slower, especially if there are millions of files, because it must stat() every file to determine those that have changed.

--
richard

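A side note on the one-liner quoted above: in a shell command of the form `A & B ; C`, `C` starts as soon as `B` finishes, whether or not the backgrounded `A` is done. If the intent is to start "nextfs" only after both parallel sends have completed, something along these lines may be closer. It is a minimal sketch, and `run_both_then` is a hypothetical helper name, not something from the thread.

```shell
# run_both_then CMD_A CMD_B CMD_NEXT
# Run CMD_A and CMD_B in parallel, wait for both, and run CMD_NEXT
# only if both exited successfully. Hypothetical helper for illustration.
run_both_then() {
    sh -c "$1" &
    pid_a=$!
    sh -c "$2" &
    pid_b=$!

    # Collect each job's exit status separately.
    rc_a=0
    wait "$pid_a" || rc_a=$?
    rc_b=0
    wait "$pid_b" || rc_b=$?

    if [ "$rc_a" -ne 0 ] || [ "$rc_b" -ne 0 ]; then
        echo "parallel job failed (exit codes: $rc_a, $rc_b)" >&2
        return 1
    fi
    sh -c "$3"
}

# Intended use with the commands from the original post (not executed here):
#   run_both_then \
#     'zfs send mypool/myfs@now | zfs receive backupzpool1/now' \
#     'zfs send mypool/myfs@now | zfs receive backupzpool2/now' \
#     'zfs send mypool/nextfs@now | zfs receive backupzpool3/now'
```

Note that the exit status of each pipeline is that of its last command, so a failing send is only noticed here if it also makes the receive fail.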
knatte_fnatte_tjatte at yahoo.com said:
> Is rsync faster? As I have understood it, "zfs send.." gives me an exact
> replica, whereas rsync doesnt necessary do that, maybe the ACL are not
> replicated, etc. Is this correct about rsync vs "zfs send"?

It is true that rsync (as of 3.0.5, anyway) does not preserve NFSv4/ZFS ACLs. It also cannot handle ZFS snapshots.

On the other hand, you can run multiple rsyncs in parallel; you can only do that with zfs send/recv if you have multiple, independent ZFS datasets that can be done in parallel. So which one goes faster will depend on your situation.

Regards,

Marion

On Oct 26, 2009, at 11:51 AM, Marion Hakanson wrote:
> knatte_fnatte_tjatte at yahoo.com said:
>> Is rsync faster? As I have understood it, "zfs send.." gives me an
>> exact replica, whereas rsync doesnt necessary do that, maybe the ACL
>> are not replicated, etc. Is this correct about rsync vs "zfs send"?
>
> It is true that rsync (as of 3.0.5, anyway) does not preserve NFSv4/ZFS
> ACLs. It also cannot handle ZFS snapshots.
>
> On the other hand, you can run multiple rsyncs in parallel; you can
> only do that with zfs send/recv if you have multiple, independent ZFS
> datasets that can be done in parallel. So which one goes faster will
> depend on your situation.

Yes. Your configuration and intended use affect the decision. Also, b119 improves stat() performance, which should help rsync and other file-based backup software.
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6775100

--
richard

On Sun, October 25, 2009 03:45, Orvar Korvar wrote:
> It seems that "zfs send..." takes quite some time? 300GB takes 10 hours,
> this far. And I have in total 3TB to backup. This means it will take 100
> hours. Is this normal? If I had 30TB to back up, it would take 1000 hours,
> which is more than a month. Can I speed this up?

That seems pretty bad. I back up around 650GB to a USB-2 external drive (not the world's fastest!!!) in about 7 hours, last I checked the time.

> Is rsync faster? As I have understood it, "zfs send.." gives me an exact
> replica, whereas rsync doesnt necessary do that, maybe the ACL are not
> replicated, etc. Is this correct about rsync vs "zfs send"?

rsync doesn't seem to cover the ACLs, last I looked closely. Which is why I converted my working rsync-based version to a zfs send-receive version that only supports full backups and isn't automated yet. Grumble. (The ACLs are key for in-kernel CIFS, which is what drove me here.)

--
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

TO CONCLUDE:

OK, it seems that I misjudged the speed of zfs send and remembered wrongly. Because of the large amount of data I transferred, it felt like it took forever. But now I have timed it, and 239GB is transferred in about one hour (from a 5-disk raidz1 onto a single disk), giving me ~70MB/sec. This must be considered acceptable.

So, forget this thread. I should delete it. Everything is fine.
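
For comparison, the transfer figures quoted in this thread can be put on a common MB/s scale with a little arithmetic. This is just a sketch: `rate_mb_s` and `eta_hours` are hypothetical helper names, and decimal units (1 GB = 1000 MB) are assumed, so the results are approximate.

```shell
# rate_mb_s GIGABYTES HOURS
# Average throughput in MB/s, assuming 1 GB = 1000 MB.
rate_mb_s() {
    LC_ALL=C awk -v gb="$1" -v h="$2" \
        'BEGIN { printf "%.1f\n", gb * 1000 / (h * 3600) }'
}

# eta_hours GIGABYTES MB_PER_SECOND
# Hours needed to move the given amount of data at a steady rate.
eta_hours() {
    LC_ALL=C awk -v gb="$1" -v r="$2" \
        'BEGIN { printf "%.1f\n", gb * 1000 / r / 3600 }'
}

rate_mb_s 300 10   # first estimate in the thread: prints 8.3
rate_mb_s 650 7    # the USB-2 backup:             prints 25.8
rate_mb_s 239 1    # the timed run:                prints 66.4
eta_hours 3000 70  # 3TB at ~70MB/sec:             prints 11.9
```

On these numbers, the timed run is roughly eight times faster than the first estimate, and the full 3TB would take about half a day rather than 100 hours.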