Richard Elling
2009-Jan-26  05:25 UTC
[zfs-discuss] thoughts on parallel backups, rsync, and send/receive
Recently, I've been working on a project which had aggressive backup requirements. I believe we solved the problem with parallelism. You might consider doing the same. If you get time to do your own experiments, please share your observations with the community.

http://richardelling.blogspot.com/2009/01/parallel-zfs-sendreceive.html

-- richard
Ian Collins
2009-Jan-26  07:16 UTC
[zfs-discuss] thoughts on parallel backups, rsync, and send/receive
You raise some interesting points about rsync getting bogged down over time. I have been working with a client with a requirement for replication between a number of hosts, and I have found that doing several send/receives at once made quite an impact. What I haven't done is try this with the latest performance improvements in b105. Have you? My guess is the gain will be less.

One thing I have yet to do is find the optimum number of parallel transfers when there are 100s of filesystems. I'm looking into making this dynamic, based on throughput.

Are you working with OpenSolaris? I still haven't managed to nail the toxic streams problem in Solaris 10, which has curtailed my project.

--
Ian.
Ahmed Kamal
2009-Jan-26  09:45 UTC
[zfs-discuss] thoughts on parallel backups, rsync, and send/receive
Did anyone share a script to send/recv a tree of ZFS filesystems in parallel, especially one where a cap on concurrency can be specified?

Richard, how fast were you taking those snapshots, and how fast were the syncs over the network? For example, assuming a snapshot every 10 minutes, is it reasonable to expect to sync every snapshot as it is created? What would be the limit when trying to lower those 10 minutes even further?

Is it catastrophic if a second zfs send launches while an older one is still running?

Regards
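A minimal sketch of the kind of script asked about here: a set of send/receive jobs run in parallel with a cap on concurrency. It is an illustration under stated assumptions, not anyone's production script. The names `send_receive` and `replicate_all`, the use of ssh as transport, and the job tuple layout are all hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def send_receive(snapshot, remote_host, target_fs):
    """Pipe `zfs send` for one snapshot into `zfs receive` on a remote host.

    Assumes passwordless ssh to remote_host and sufficient ZFS privileges
    on both ends. Returns True only if both sides exit with status 0.
    """
    send = subprocess.Popen(["zfs", "send", snapshot], stdout=subprocess.PIPE)
    recv = subprocess.Popen(
        ["ssh", remote_host, "zfs", "receive", target_fs],
        stdin=send.stdout,
    )
    send.stdout.close()  # so zfs send gets SIGPIPE if the receiver exits early
    recv_status = recv.wait()
    send_status = send.wait()
    return send_status == 0 and recv_status == 0


def replicate_all(jobs, max_parallel, transfer=send_receive):
    """Run transfer(*job) for every job, at most max_parallel at a time.

    jobs is a list of (snapshot, remote_host, target_fs) tuples.
    Returns a dict mapping each job tuple to True or False.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = {job: pool.submit(transfer, *job) for job in jobs}
        return {job: future.result() for job, future in futures.items()}
```

Calling `replicate_all(jobs, max_parallel=4)` would keep at most four transfers running at once. The `transfer` parameter exists so the scheduling can be exercised with a stand-in function before pointing it at real pools.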
Richard Elling
2009-Jan-26  17:12 UTC
[zfs-discuss] thoughts on parallel backups, rsync, and send/receive
Ian Collins wrote:
> You raise some interesting points about rsync getting bogged down over time. I have been working with a client with a requirement for replication between a number of hosts and I have found doing several send/receives made quite an impact. What I haven't done is try this with the latest performance improvements in b105. Have you? My guess is the gain will be less.

Unfortunately, the rig was constrained to Solaris 10 10/08, so I don't have any data on this for OpenSolaris.

> One thing I have yet to do is find the optimum number of parallel transfers when there are 100s of filesystems. I'm looking into making this dynamic, based on throughput.

I'm not convinced that a throughput throttle or metric will be meaningful. I believe this will need to be iop-based.

> Are you working with OpenSolaris? I still haven't managed to nail the toxic streams problem in Solaris 10, which has curtailed my project.

I am aware of the bug, but have not seen it. Murphy's Law says it won't happen until we roll into production :-(

-- richard
Richard Elling
2009-Jan-26  17:17 UTC
[zfs-discuss] thoughts on parallel backups, rsync, and send/receive
Ahmed Kamal wrote:
> Did anyone share a script to send/recv zfs filesystems tree in parallel, especially if a cap on concurrency can be specified? Richard, how fast were you taking those snapshots, how fast were the syncs over the network. For example, assuming a snapshot every 10mins, is it reasonable to expect to sync every snapshot as they're created every 10 mins. What would be the limit trying to lower those 10mins even more

We were snapping every hour, with send/receive times on the order of 25 minutes. I do not believe there will be time to experiment with other combinations.

> Is it catastrophic if a second zfs send launches, while an older one is still being run

I use a semaphore property to help avoid this, by design. That said, I have not tried to see if there is a lurking bug with ZFS receive that would need to be fixed if it cannot handle concurrent receives.

My send/receive script will incrementally copy from the latest common snapshot to the latest snapshot. For rsync, it will sync from the epoch to the latest snapshot.

-- richard
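The "latest common snapshot" step described above can be sketched roughly as follows. This is only an illustration: the function names are hypothetical, the snapshot lists are assumed to be supplied by the caller in oldest-to-newest order, and the fallback to a full send when nothing is shared is an assumption rather than something stated in the thread. The semaphore property is not modelled here.

```python
def latest_common_snapshot(source_snaps, target_snaps):
    """Return the newest snapshot name present on both sides, or None.

    Both lists hold bare snapshot names (the part after '@').
    source_snaps is assumed to be ordered oldest to newest.
    """
    on_target = set(target_snaps)
    for name in reversed(source_snaps):
        if name in on_target:
            return name
    return None


def incremental_send_command(filesystem, source_snaps, target_snaps):
    """Build the `zfs send` argument list for the next transfer.

    Returns None when the target already holds the latest snapshot.
    """
    latest = source_snaps[-1]
    common = latest_common_snapshot(source_snaps, target_snaps)
    if common is None:
        # Assumption: with no shared snapshot, fall back to a full send.
        return ["zfs", "send", f"{filesystem}@{latest}"]
    if common == latest:
        return None
    # Incremental stream from the common snapshot to the latest one.
    return ["zfs", "send", "-i", f"{filesystem}@{common}", f"{filesystem}@{latest}"]
```

With hourly snapshots `hourly-01` to `hourly-03` on the source and only the first two on the target, this yields `zfs send -i tank/home@hourly-02 tank/home@hourly-03`.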
Ian Collins
2009-Jan-26  20:51 UTC
[zfs-discuss] thoughts on parallel backups, rsync, and send/receive
Richard Elling wrote:
> Ian Collins wrote:
>> One thing I have yet to do is find the optimum number of parallel transfers when there are 100s of filesystems. I'm looking into making this dynamic, based on throughput.
>
> I'm not convinced that a throughput throttle or metric will be meaningful. I believe this will need to be iop-based.

OK, I'll check. I was looking at adding jobs until the average send time declined.

>> Are you working with OpenSolaris? I still haven't managed to nail the toxic streams problem in Solaris 10, which has curtailed my project.
>
> I am aware of the bug, but have not seen it. Murphy's Law says it won't happen until we roll into production :-(

How many file systems do you have? I hit the problem about once in 1500 send/receives. The last time was with a 1TB filesystem with about 600GB of snaps, so I couldn't attach it to the bug!

--
Ian.
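One way to read the "add jobs until it stops helping" idea is as a simple hill climb over the job count. The sketch below is hypothetical and leaves the metric open: `measure` is a caller-supplied function, so the score could be throughput, an IOPS-based figure as suggested earlier in the thread, or the inverse of average send time.

```python
def tune_parallelism(measure, start=1, limit=16):
    """Increase the job count until the measured score stops improving.

    measure(n) is a caller-supplied function that runs a batch with n
    parallel jobs and returns a score where higher is better.
    Returns the last job count that improved on its predecessor.
    """
    best_n = start
    best_score = measure(start)
    for n in range(start + 1, limit + 1):
        score = measure(n)
        if score <= best_score:
            break  # no further gain: keep the previous job count
        best_n, best_score = n, score
    return best_n
```

Because it stops at the first step that fails to improve, this can settle on a local optimum, and a noisy metric would probably need averaging over several batches before each comparison.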