David Dyer-Bennet
2010-Aug-10 15:38 UTC
[zfs-discuss] Problems with big ZFS send/receive in b134
My full backup still doesn't complete. However, instead of hanging the entire disk subsystem as it did on 111b, it now issues error messages. Errors at the end.

sending from @bup-daily-20100726-100000CDT to zp1/ddb@bup-daily-20100727-100000CDT
received 3.80GB stream in 136 seconds (28.6MB/sec)
receiving incremental stream of zp1/ddb@bup-daily-20100727-100000CDT into bup-wrack/fsfs/zp1/ddb@bup-daily-20100727-100000CDT
sending from @bup-daily-20100727-100000CDT to zp1/ddb@bup-daily-20100728-100001CDT
received 192MB stream in 10 seconds (19.2MB/sec)
receiving incremental stream of zp1/ddb@bup-daily-20100728-100001CDT into bup-wrack/fsfs/zp1/ddb@bup-daily-20100728-100001CDT
sending from @bup-daily-20100728-100001CDT to zp1/ddb@bup-daily-20100729-100000CDT
received 170MB stream in 9 seconds (18.9MB/sec)
receiving incremental stream of zp1/ddb@bup-daily-20100729-100000CDT into bup-wrack/fsfs/zp1/ddb@bup-daily-20100729-100000CDT
sending from @bup-daily-20100729-100000CDT to zp1/ddb@bup-2hr-20100729-220000CDT
warning: cannot send 'zp1/ddb@bup-2hr-20100729-220000CDT': no such pool or dataset
sending from @bup-2hr-20100729-220000CDT to zp1/ddb@bup-2hr-20100730-000000CDT
warning: cannot send 'zp1/ddb@bup-2hr-20100730-000000CDT': no such pool or dataset
sending from @bup-2hr-20100730-000000CDT to zp1/ddb@bup-2hr-20100730-020000CDT
warning: cannot send 'zp1/ddb@bup-2hr-20100730-020000CDT': no such pool or dataset
sending from @bup-2hr-20100730-020000CDT to zp1/ddb@bup-2hr-20100730-040000CDT
warning: cannot send 'zp1/ddb@bup-2hr-20100730-040000CDT': incremental source (@bup-2hr-20100730-020000CDT) does not exist
sending from @bup-2hr-20100730-040000CDT to zp1/ddb@bup-2hr-20100730-060000CDT
sending from @bup-2hr-20100730-060000CDT to zp1/ddb@bup-2hr-20100730-080000CDT
sending from @bup-2hr-20100730-080000CDT to zp1/ddb@bup-daily-20100730-100000CDT
sending from @bup-daily-20100730-100000CDT to zp1/ddb@bup-2hr-20100730-100000CDT
sending from @bup-2hr-20100730-100000CDT to zp1/ddb@bup-2hr-20100730-120000CDT
sending from @bup-2hr-20100730-120000CDT to zp1/ddb@bup-2hr-20100730-140000CDT
sending from @bup-2hr-20100730-140000CDT to zp1/ddb@bup-2hr-20100730-160000CDT
sending from @bup-2hr-20100730-160000CDT to zp1/ddb@bup-2hr-20100730-180000CDT
sending from @bup-2hr-20100730-180000CDT to zp1/ddb@bup-2hr-20100730-200000CDT
sending from @bup-2hr-20100730-200000CDT to zp1/ddb@bup-2hr-20100730-220000CDT
received 162MB stream in 9 seconds (18.0MB/sec)
receiving incremental stream of zp1/ddb@bup-2hr-20100730-060000CDT into bup-wrack/fsfs/zp1/ddb@bup-2hr-20100730-060000CDT
cannot receive incremental stream: most recent snapshot of bup-wrack/fsfs/zp1/ddb does not match incremental source
bash-4.0$

The bup-wrack pool was newly created and empty before this backup started.

The backup commands were:

zfs send -Rv "$srcsnap" | zfs recv -Fudv "$BUPPOOL/$HOSTNAME/$FS"

I don't see how anything could be creating snapshots on bup-wrack while this was running. That pool is not normally mounted (it's on a single external USB drive; I plug it in for backups). My script for doing regular snapshots of zp1 and rpool doesn't reference any of the bup-* pools.

I don't see how this snapshot mismatch can be coming from anything but the send/receive process.

There are quite a lot of snapshots: dailies for some months, 2-hour ones for a couple of weeks. Most of them are empty or tiny.
Next time I will try WITHOUT -v on both ends, and arrange to capture the expanded version of the command with all the variables filled in, but I don't expect any different outcome.

Any other ideas?

--
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
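A minimal sketch of one way to capture the expanded pipeline from inside the backup script, assuming it is a bash script that already sets srcsnap, BUPPOOL, HOSTNAME, and FS (the log path is only illustrative):

  set -x                                   # bash prints each expanded command to stderr
  exec 2>/var/tmp/zfs-backup-trace.log     # keep the trace, plus any send/recv errors, in a file
  zfs send -R "$srcsnap" | zfs recv -Fud "$BUPPOOL/$HOSTNAME/$FS"
  set +x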
David Dyer-Bennet
2010-Aug-10 15:58 UTC
[zfs-discuss] Problems with big ZFS send/receive in b134
Additional information. I started another run, and captured the exact expanded commands. These SHOULD BE the exact commands used in the last run, except for the snapshot name (this script makes a recursive snapshot just before it starts a backup). In any case they ARE the exact commands used in this new run, and we'll see what happens at the end of this run. (These are from a bash trace as produced by "set -x".)

+ zfs create -p bup-wrack/fsfs/zp1
+ zfs send -Rp zp1@bup-20100810-154542GMT
+ zfs recv -Fud bup-wrack/fsfs/zp1

(The send and the receive are source and sink in a pipeline.)

As you can see, the destination filesystem is new in the bup-wrack pool. The "-R" on the send should, as I understand it, create a replication stream which will "replicate the specified filesystem, and all descendent file systems, up to the named snapshot. When received, all properties, snapshots, descendent file systems, and clones are preserved." This should send the full state of zp1 up to the snapshot, and the receive should receive it into bup-wrack/fsfs/zp1.

Isn't this how a "full backup" should be made using zfs send/receive? (Once this is working, I intend to use -I to send incremental streams to update it regularly.)

bash-4.0$ zpool list
NAME         SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
bup-wrack    928G  4.62G   923G     0%  1.00x  ONLINE  /backups/bup-wrack
rpool        149G  10.0G   139G     6%  1.00x  ONLINE  -
zp1         1.09T   743G   373G    66%  1.00x  ONLINE  -

zp1 is my primary data pool. It's not very big (physically it's three 2-way mirrors of 400GB drives); it has 743G of data in it. bup-wrack is the backup pool; it's a single 1TB external USB drive. This was taken shortly after starting the second try at a full backup (since the b134 upgrade), so bup-wrack is still mostly empty.

None of the pools have shown any errors of any sort in months. zp1 and rpool are scrubbed weekly.

--
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
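For reference, the full-then-incremental pattern described above would look roughly like this (a sketch only; the snapshot names are hypothetical, and the destination follows the commands in the trace):

  # initial full replication stream
  zfs snapshot -r zp1@bup-full
  zfs send -R zp1@bup-full | zfs recv -Fud bup-wrack/fsfs/zp1

  # later: send only the changes between the last replicated snapshot and a new one
  zfs snapshot -r zp1@bup-next
  zfs send -R -I @bup-full zp1@bup-next | zfs recv -Fud bup-wrack/fsfs/zp1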
Dave Pacheco
2010-Aug-10 18:23 UTC
[zfs-discuss] Problems with big ZFS send/receive in b134
David Dyer-Bennet wrote:
> My full backup still doesn't complete. However, instead of hanging the
> entire disk subsystem as it did on 111b, it now issues error messages.
> Errors at the end.
[...]
> cannot receive incremental stream: most recent snapshot of
> bup-wrack/fsfs/zp1/ddb does not match incremental source
> bash-4.0$
>
> The bup-wrack pool was newly created and empty before this backup started.
>
> The backup commands were:
>
> zfs send -Rv "$srcsnap" | zfs recv -Fudv "$BUPPOOL/$HOSTNAME/$FS"
>
> I don't see how anything could be creating snapshots on bup-wrack while
> this was running. That pool is not normally mounted (it's on a single
> external USB drive; I plug it in for backups). My script for doing
> regular snapshots of zp1 and rpool doesn't reference any of the bup-*
> pools.
>
> I don't see how this snapshot mismatch can be coming from anything but
> the send/receive process.
>
> There are quite a lot of snapshots: dailies for some months, 2-hour ones
> for a couple of weeks. Most of them are empty or tiny.
>
> Next time I will try WITHOUT -v on both ends, and arrange to capture the
> expanded version of the command with all the variables filled in, but I
> don't expect any different outcome.
>
> Any other ideas?

Is it possible that snapshots were renamed on the sending pool during the send operation?

-- Dave

--
David Pacheco, Sun Microsystems Fishworks.     http://blogs.sun.com/dap/
David Dyer-Bennet
2010-Aug-10 18:46 UTC
[zfs-discuss] Problems with big ZFS send/receive in b134
On Tue, August 10, 2010 13:23, Dave Pacheco wrote:
> David Dyer-Bennet wrote:
>> My full backup still doesn't complete. However, instead of hanging the
>> entire disk subsystem as it did on 111b, it now issues error messages.
>> Errors at the end.
> [...]
>> cannot receive incremental stream: most recent snapshot of
>> bup-wrack/fsfs/zp1/ddb does not match incremental source
>> bash-4.0$
>>
>> The bup-wrack pool was newly created and empty before this backup started.
>>
>> The backup commands were:
>>
>> zfs send -Rv "$srcsnap" | zfs recv -Fudv "$BUPPOOL/$HOSTNAME/$FS"
>>
>> I don't see how anything could be creating snapshots on bup-wrack while
>> this was running. That pool is not normally mounted (it's on a single
>> external USB drive; I plug it in for backups). My script for doing
>> regular snapshots of zp1 and rpool doesn't reference any of the bup-*
>> pools.
>>
>> I don't see how this snapshot mismatch can be coming from anything but
>> the send/receive process.
>>
>> There are quite a lot of snapshots: dailies for some months, 2-hour ones
>> for a couple of weeks. Most of them are empty or tiny.
>>
>> Next time I will try WITHOUT -v on both ends, and arrange to capture the
>> expanded version of the command with all the variables filled in, but I
>> don't expect any different outcome.
>>
>> Any other ideas?
>
> Is it possible that snapshots were renamed on the sending pool during
> the send operation?

I don't have any scripts that rename a snapshot (in fact I didn't know it was possible until just now), and I don't have other users with permission to make snapshots (either delegated or by root access). I'm not using the Sun auto-snapshot service; I've got a much simpler script of my own (hence I know what it does). So I don't at the moment see how one would be getting renamed.

It's possible that a snapshot was *deleted* on the sending pool during the send operation, however. Also that snapshots were created (though a newly created one would be after the one specified in the zfs send -R, and hence should be irrelevant). In fact it's certain that snapshots were created, and I'm nearly certain some were deleted.

If that turns out to be the problem, it'll be annoying to work around (I'm making snapshots every two hours and deleting them after a couple of weeks). Locks between admin scripts rarely end well, in my experience. But at least I'd know what I had to work around.

Am I looking for too much here? I *thought* I was doing something that should be simple and basic and frequently used nearly everywhere, and hence certain to work. "What could go wrong?", I thought :-). If I'm doing something inherently dicey I can try to find a way to back off; as my primary backup process, this needs to be rock-solid.

--
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
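One way to confirm from the record whether snapshots really were created or destroyed on zp1 while the send was running is the pool's command history, which logs zfs snapshot and zfs destroy operations with timestamps (a sketch; the grep pattern is only illustrative):

  zpool history -l zp1 | egrep 'zfs (snapshot|destroy)'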
Dave Pacheco
2010-Aug-10 21:41 UTC
[zfs-discuss] Problems with big ZFS send/receive in b134
David Dyer-Bennet wrote:
> On Tue, August 10, 2010 13:23, Dave Pacheco wrote:
>> David Dyer-Bennet wrote:
>>> My full backup still doesn't complete. However, instead of hanging the
>>> entire disk subsystem as it did on 111b, it now issues error messages.
>>> Errors at the end.
>> [...]
>>> cannot receive incremental stream: most recent snapshot of
>>> bup-wrack/fsfs/zp1/ddb does not match incremental source
>>> bash-4.0$
>>>
>>> The bup-wrack pool was newly created and empty before this backup started.
>>>
>>> The backup commands were:
>>>
>>> zfs send -Rv "$srcsnap" | zfs recv -Fudv "$BUPPOOL/$HOSTNAME/$FS"
>>>
>>> I don't see how anything could be creating snapshots on bup-wrack while
>>> this was running. That pool is not normally mounted (it's on a single
>>> external USB drive; I plug it in for backups). My script for doing
>>> regular snapshots of zp1 and rpool doesn't reference any of the bup-*
>>> pools.
>>>
>>> I don't see how this snapshot mismatch can be coming from anything but
>>> the send/receive process.
>>>
>>> There are quite a lot of snapshots: dailies for some months, 2-hour ones
>>> for a couple of weeks. Most of them are empty or tiny.
>>>
>>> Next time I will try WITHOUT -v on both ends, and arrange to capture the
>>> expanded version of the command with all the variables filled in, but I
>>> don't expect any different outcome.
>>>
>>> Any other ideas?
>>
>> Is it possible that snapshots were renamed on the sending pool during
>> the send operation?
>
> I don't have any scripts that rename a snapshot (in fact I didn't know it
> was possible until just now), and I don't have other users with permission
> to make snapshots (either delegated or by root access). I'm not using the
> Sun auto-snapshot service; I've got a much simpler script of my own (hence
> I know what it does). So I don't at the moment see how one would be
> getting renamed.
>
> It's possible that a snapshot was *deleted* on the sending pool during the
> send operation, however. Also that snapshots were created (though a newly
> created one would be after the one specified in the zfs send -R, and hence
> should be irrelevant). In fact it's certain that snapshots were created,
> and I'm nearly certain some were deleted.
>
> If that turns out to be the problem, it'll be annoying to work around
> (I'm making snapshots every two hours and deleting them after a couple of
> weeks). Locks between admin scripts rarely end well, in my experience.
> But at least I'd know what I had to work around.
>
> Am I looking for too much here? I *thought* I was doing something that
> should be simple and basic and frequently used nearly everywhere, and
> hence certain to work. "What could go wrong?", I thought :-). If I'm
> doing something inherently dicey I can try to find a way to back off; as
> my primary backup process, this needs to be rock-solid.

It's certainly a reasonable thing to do and it should work. There have been a few problems around deleting and renaming snapshots as they're being sent, but the delete issues were fixed in build 123 by having zfs_send hold snapshots being sent (as long as you've upgraded your pool past version 18), and it sounds like you're not doing renames, so your problem may be unrelated.

-- Dave

--
David Pacheco, Sun Microsystems Fishworks.     http://blogs.sun.com/dap/
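A quick way to check where a pool stands relative to version 18 (the version that added snapshot user holds), for anyone following along (a sketch):

  zpool get version zp1     # the pool's current on-disk version
  zpool upgrade -v          # lists the supported versions and what each one added
  # zpool upgrade zp1       # one-way: upgrades the pool and gives up the fall-back option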
David Dyer-Bennet
2010-Aug-11 03:45 UTC
[zfs-discuss] Problems with big ZFS send/receive in b134
On 10-Aug-10 13:46, David Dyer-Bennet wrote:
> On Tue, August 10, 2010 13:23, Dave Pacheco wrote:
>> David Dyer-Bennet wrote:
>>> My full backup still doesn't complete. However, instead of hanging the
>>> entire disk subsystem as it did on 111b, it now issues error messages.
>>> Errors at the end.
>> [...]
>>> cannot receive incremental stream: most recent snapshot of
>>> bup-wrack/fsfs/zp1/ddb does not match incremental source
>>> bash-4.0$
>>>
>>> The bup-wrack pool was newly created and empty before this backup started.
>>>
>>> The backup commands were:
>>>
>>> zfs send -Rv "$srcsnap" | zfs recv -Fudv "$BUPPOOL/$HOSTNAME/$FS"
>>>
>>> I don't see how anything could be creating snapshots on bup-wrack while
>>> this was running. That pool is not normally mounted (it's on a single
>>> external USB drive; I plug it in for backups). My script for doing
>>> regular snapshots of zp1 and rpool doesn't reference any of the bup-*
>>> pools.
>>>
>>> I don't see how this snapshot mismatch can be coming from anything but
>>> the send/receive process.
>>>
>>> There are quite a lot of snapshots: dailies for some months, 2-hour ones
>>> for a couple of weeks. Most of them are empty or tiny.
>>>
>>> Next time I will try WITHOUT -v on both ends, and arrange to capture the
>>> expanded version of the command with all the variables filled in, but I
>>> don't expect any different outcome.
>>>
>>> Any other ideas?
>>
>> Is it possible that snapshots were renamed on the sending pool during
>> the send operation?
>
> I don't have any scripts that rename a snapshot (in fact I didn't know it
> was possible until just now), and I don't have other users with permission
> to make snapshots (either delegated or by root access). I'm not using the
> Sun auto-snapshot service; I've got a much simpler script of my own (hence
> I know what it does). So I don't at the moment see how one would be
> getting renamed.
>
> It's possible that a snapshot was *deleted* on the sending pool during the
> send operation, however. Also that snapshots were created (though a newly
> created one would be after the one specified in the zfs send -R, and hence
> should be irrelevant). In fact it's certain that snapshots were created,
> and I'm nearly certain some were deleted.

More information. The test I started this morning errored out somewhat similarly, and one set of errors is clearly deleted snapshots (they're 2hr snapshots, some of which get deleted every 2 hours). There are also errors relating to "incremental streams", which is strange since I'm not using -I or -i at all.

Here are the commands again, and all the output.
+ zfs create -p bup-wrack/fsfs/zp1
+ zfs send -Rp zp1@bup-20100810-154542GMT
+ zfs recv -Fud bup-wrack/fsfs/zp1
warning: cannot send 'zp1/ddb@bup-2hr-20100731-120000CDT': no such pool or dataset
warning: cannot send 'zp1/ddb@bup-2hr-20100731-140000CDT': no such pool or dataset
warning: cannot send 'zp1/ddb@bup-2hr-20100731-160000CDT': no such pool or dataset
warning: cannot send 'zp1/ddb@bup-20100731-213303GMT': incremental source (@bup-2hr-20100731-160000CDT) does not exist
warning: cannot send 'zp1/ddb@bup-2hr-20100731-180000CDT': no such pool or dataset
warning: cannot send 'zp1/ddb@bup-2hr-20100731-200000CDT': incremental source (@bup-2hr-20100731-180000CDT) does not exist
cannot receive incremental stream: most recent snapshot of bup-wrack/fsfs/zp1/ddb does not match incremental source

Afterward,

bash-4.0$ zpool list
NAME         SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
bup-wrack    928G   687G   241G    73%  1.00x  ONLINE  /backups/bup-wrack
rpool        149G  10.0G   139G     6%  1.00x  ONLINE  -
zp1         1.09T   743G   373G    66%  1.00x  ONLINE  -

So quite a lot did get transferred, but not all.

It appears clear, then, that snapshots being deleted during the zfs send -R cause warnings. A warning is fine: since they're not there it can't send them, and they were there when the command was given, so it makes sense for it to try.

That last message, which is tagged as neither warning nor error, worries me though. And I'm wondering how complete the transfer is; I believe the backup copy is compressed whereas the zp1 copy isn't, so the difference in ALLOC isn't clear-cut evidence of anything. I'll try to guess a few things that should be recent and see if they in fact got into the backup.

--
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
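One way to see how far the receive actually got, rather than guessing at recent files, is to compare the newest snapshots on each side (a sketch using the dataset names above):

  zfs list -r -t snapshot -o name,creation zp1/ddb | tail -5
  zfs list -r -t snapshot -o name,creation bup-wrack/fsfs/zp1/ddb | tail -5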
Ian Collins
2010-Aug-11 04:13 UTC
[zfs-discuss] Problems with big ZFS send/receive in b134
On 08/11/10 03:45 PM, David Dyer-Bennet wrote:
> On 10-Aug-10 13:46, David Dyer-Bennet wrote:
>>
>> It's possible that a snapshot was *deleted* on the sending pool during
>> the send operation, however. Also that snapshots were created (though a
>> newly created one would be after the one specified in the zfs send -R,
>> and hence should be irrelevant). In fact it's certain that snapshots
>> were created, and I'm nearly certain some were deleted.
>
> More information. The test I started this morning errored out somewhat
> similarly, and one set of errors is clearly deleted snapshots (they're
> 2hr snapshots, some of which get deleted every 2 hours). There are also
> errors relating to "incremental streams", which is strange since I'm not
> using -I or -i at all.
>
> Here are the commands again, and all the output.
>
> + zfs create -p bup-wrack/fsfs/zp1
> + zfs send -Rp zp1@bup-20100810-154542GMT
> + zfs recv -Fud bup-wrack/fsfs/zp1
> warning: cannot send 'zp1/ddb@bup-2hr-20100731-120000CDT': no such pool or dataset
> warning: cannot send 'zp1/ddb@bup-2hr-20100731-140000CDT': no such pool or dataset
> warning: cannot send 'zp1/ddb@bup-2hr-20100731-160000CDT': no such pool or dataset
> warning: cannot send 'zp1/ddb@bup-20100731-213303GMT': incremental source (@bup-2hr-20100731-160000CDT) does not exist
> warning: cannot send 'zp1/ddb@bup-2hr-20100731-180000CDT': no such pool or dataset
> warning: cannot send 'zp1/ddb@bup-2hr-20100731-200000CDT': incremental source (@bup-2hr-20100731-180000CDT) does not exist
> cannot receive incremental stream: most recent snapshot of
> bup-wrack/fsfs/zp1/ddb does not match incremental source

That last error occurs if the snapshot exists but has changed: it has been deleted and a new one with the same name created.

> That last message, which is tagged as neither warning nor error, worries
> me though. And I'm wondering how complete the transfer is; I believe the
> backup copy is compressed whereas the zp1 copy isn't, so the difference
> in ALLOC isn't clear-cut evidence of anything.

It probably aborted the send.

--
Ian.
David Dyer-Bennet
2010-Aug-11 14:21 UTC
[zfs-discuss] Problems with big ZFS send/receive in b134
On Tue, August 10, 2010 23:13, Ian Collins wrote:
> On 08/11/10 03:45 PM, David Dyer-Bennet wrote:
>> cannot receive incremental stream: most recent snapshot of
>> bup-wrack/fsfs/zp1/ddb does not match incremental source
>
> That last error occurs if the snapshot exists but has changed: it has
> been deleted and a new one with the same name created.

So for testing purposes at least, I need to shut down everything I have that creates or deletes snapshots. (I don't, though, have anything that would delete one and create one with the same name. I create snapshots with various names (2hr, daily, weekly, monthly, yearly) and a current timestamp, and I delete old ones (many days old at a minimum).)

And I think I'll abstract the commands from my backup script into a simpler dedicated test script, so I'm sure I'm doing exactly the same thing each time (that should cause me to hit on a combination that works right away :-) ).

Is there anything stock in b134 that messes with snapshots that I should shut down to keep things stable, or am I only worried about my own stuff?

Are other people out there not using send/receive for backups? Or not trying to preserve snapshots while doing it? Or are you doing what I'm doing, and not having the problems I'm having?

--
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
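A standalone test script along the lines described might look roughly like this (a sketch only; the snapshot naming copies the bup-YYYYMMDD-HHMMSSGMT style from the trace quoted earlier):

  #!/bin/bash
  # One-shot full-replication test: identical commands every run, only the snapshot name varies.
  set -ex
  SNAP="bup-$(date -u +%Y%m%d-%H%M%SGMT)"
  zfs snapshot -r "zp1@$SNAP"
  zfs create -p bup-wrack/fsfs/zp1
  zfs send -Rp "zp1@$SNAP" | zfs recv -Fud bup-wrack/fsfs/zp1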
David Dyer-Bennet
2010-Aug-11 14:36 UTC
[zfs-discuss] Problems with big ZFS send/receive in b134
On Tue, August 10, 2010 16:41, Dave Pacheco wrote:
> David Dyer-Bennet wrote:
>> If that turns out to be the problem, it'll be annoying to work around
>> (I'm making snapshots every two hours and deleting them after a couple
>> of weeks). Locks between admin scripts rarely end well, in my
>> experience. But at least I'd know what I had to work around.
>>
>> Am I looking for too much here? I *thought* I was doing something that
>> should be simple and basic and frequently used nearly everywhere, and
>> hence certain to work. "What could go wrong?", I thought :-). If I'm
>> doing something inherently dicey I can try to find a way to back off;
>> as my primary backup process, this needs to be rock-solid.
>
> It's certainly a reasonable thing to do and it should work. There have
> been a few problems around deleting and renaming snapshots as they're
> being sent, but the delete issues were fixed in build 123 by having
> zfs_send hold snapshots being sent (as long as you've upgraded your pool
> past version 18), and it sounds like you're not doing renames, so your
> problem may be unrelated.

AHA! You may have nailed the issue -- I've upgraded from 111b to 134, but have not yet upgraded my pool. Checking... yes, the pool I'm sending from is V14. (I don't instantly upgrade pools; I need to preserve the option of falling back to older software for a while after an upgrade.)

So I should try either turning off my snapshot creator/deleter during the backup, or upgrading the pool. Will do! (I will eventually upgrade the pool, of course, but I think I'll try the more reversible option first. I can have the deleter check for the pid file the backup already creates to avoid two backups running at once.)

Thank you very much! This is extremely encouraging.

--
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
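A sketch of the deleter-side check mentioned above, assuming the backup script writes its process id to a pid file (the path here is hypothetical):

  # in the snapshot creator/deleter: skip pruning while a backup is running
  PIDFILE=/var/run/zfs-backup.pid
  if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
      echo "backup in progress, skipping snapshot pruning" >&2
      exit 0
  fi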
Paul Kraus
2010-Aug-11
[zfs-discuss] Problems with big ZFS send/receive in b134
On Wed, Aug 11, 2010 at 10:36 AM, David Dyer-Bennet <dd-b at dd-b.net> wrote:
>
> On Tue, August 10, 2010 16:41, Dave Pacheco wrote:
>> David Dyer-Bennet wrote:
>
>>> If that turns out to be the problem, it'll be annoying to work around
>>> (I'm making snapshots every two hours and deleting them after a couple
>>> of weeks). Locks between admin scripts rarely end well, in my
>>> experience. But at least I'd know what I had to work around.

I've had good luck with locks (eventually), but they are not trivial if you want them to be robust. It usually takes a bunch of trial and error for me.

>>> Am I looking for too much here? I *thought* I was doing something that
>>> should be simple and basic and frequently used nearly everywhere, and
>>> hence certain to work. "What could go wrong?", I thought :-). If I'm
>>> doing something inherently dicey I can try to find a way to back off;
>>> as my primary backup process, this needs to be rock-solid.

It looks like you are trying to do a full send every time; what about a first full, then incrementals (which should be much faster)? The first full might run afoul of the 2-hour snapshots (and deletions), but I would not expect the incrementals to. I am syncing about 20 TB of data between sites this way every 4 hours over a 100 Mb link. I put the snapshot management and the site-to-site replication in the same script to keep them from fighting :-)

--
{--------1---------2---------3---------4---------5---------6---------7---------}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players
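A skeleton of the arrangement Paul describes -- snapshotting, incremental replication, and pruning serialized in one script so they cannot race -- might look roughly like this (not his actual script; the names and the retention step are illustrative):

  #!/bin/bash
  set -e
  NEW="bup-$(date -u +%Y%m%d-%H%M%SGMT)"
  PREV=$(zfs list -H -r -t snapshot -o name zp1 | grep '^zp1@bup-' | tail -1 | cut -d@ -f2)

  zfs snapshot -r "zp1@$NEW"
  if [ -n "$PREV" ]; then
      zfs send -R -I "@$PREV" "zp1@$NEW" | zfs recv -Fud bup-wrack/fsfs/zp1
  else
      zfs send -R "zp1@$NEW" | zfs recv -Fud bup-wrack/fsfs/zp1
  fi

  # only after the receive has finished: destroy snapshots older than the retention window
  # (pruning logic omitted; the point is that it runs here, never concurrently with the send)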
David Dyer-Bennet
2010-Aug-12 17:08 UTC
[zfs-discuss] Problems with big ZFS send/receive in b134
On Wed, August 11, 2010 15:11, Paul Kraus wrote:
> On Wed, Aug 11, 2010 at 10:36 AM, David Dyer-Bennet <dd-b at dd-b.net> wrote:
>>>> Am I looking for too much here? I *thought* I was doing something that
>>>> should be simple and basic and frequently used nearly everywhere, and
>>>> hence certain to work. "What could go wrong?", I thought :-). If I'm
>>>> doing something inherently dicey I can try to find a way to back off;
>>>> as my primary backup process, this needs to be rock-solid.
>
> It looks like you are trying to do a full send every time; what about a
> first full, then incrementals (which should be much faster)? The first
> full might run afoul of the 2-hour snapshots (and deletions), but I would
> not expect the incrementals to. I am syncing about 20 TB of data between
> sites this way every 4 hours over a 100 Mb link. I put the snapshot
> management and the site-to-site replication in the same script to keep
> them from fighting :-)

What I'm working on is, in fact, the first backup. I intended from the start to use incrementals; they just didn't work in earlier versions, and I was reduced to doing full backups only. And I need a successful full backup to start the series, to initialize any new backup media, and so forth. So I think I have to solve this problem, even if most of the backups will be incrementals.

Mostly the incrementals should be quite fast -- but I can come home from a weekend away with 30 GB or so of photos, which would appear on the server all at once. Still, that's well under 2 hours.

--
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
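For scale: at the 18-28 MB/sec the earlier runs reported, a 30 GB incremental works out to roughly 30,000 MB at about 20 MB/sec, around 1500 seconds or 25 minutes -- comfortably inside a 2-hour snapshot window, assuming the USB backup pool sustains similar rates.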