David Dyer-Bennet
2010-Mar-05 15:32 UTC
[zfs-discuss] ZFS replication send/receive errors out
My full backup script errored out the last two times I ran it. I've got a full Bash trace of it, so I know exactly what was done.

There are a moderate number of snapshots on the zp1 pool, and I'm intending to replicate the whole thing into the backup pool.

After housekeeping, I take a current snapshot on the data pool (zp1). Since this is a new full backup, I then destroy the data on the backup pool (bup-ruin). Then I create the appropriate backup filesystem (bup-ruin/fsfs/zp1), and do a full replication send and receive. (They're piped together; Bash trace output shows the commands separately below.)

+ zfs snapshot -r zp1@bup-20100303-130903GMT
+ zfs destroy -rf bup-ruin/fsfs/zp1
+ zfs create -p bup-ruin/fsfs/zp1
+ zfs send -Rv zp1@bup-20100303-130903GMT
+ zfs recv -Fudv bup-ruin/fsfs/zp1
sending from @ to zp1@bup-20090223-033745UTC

So it starts sending, and receiving (not shown), and this goes on for a long while, and eventually (I left in a number of the late snapshots being transferred):

sending from @bup-4hr-20100227-040000CST to zp1/lydy@bup-4hr-20100227-080000CST
sending from @bup-4hr-20100227-080000CST to zp1/lydy@bup-daily-20100227-100000CST
sending from @bup-daily-20100227-100000CST to zp1/lydy@bup-4hr-20100227-120000CST
sending from @bup-4hr-20100227-120000CST to zp1/lydy@bup-4hr-20100227-160000CST
sending from @bup-4hr-20100227-160000CST to zp1/lydy@bup-4hr-20100227-200000CST
sending from @bup-4hr-20100227-200000CST to zp1/lydy@bup-4hr-20100228-000000CST
received 271KB stream in 1 seconds (271KB/sec)
sending from @bup-4hr-20100228-000000CST to zp1/lydy@bup-4hr-20100228-040000CST
receiving incremental stream of zp1/lydy@bup-monthly-20100222-210000CST into bup-ruin/fsfs/zp1/lydy@bup-monthly-20100222-210000CST
received 312B stream in 1 seconds (312B/sec)
receiving incremental stream of zp1/lydy@bup-daily-20100222-210000CST into bup-ruin/fsfs/zp1/lydy@bup-daily-20100222-210000CST
sending from @bup-4hr-20100228-040000CST to zp1/lydy@bup-4hr-20100228-080000CST
received 312B stream in 1 seconds (312B/sec)
receiving incremental stream of zp1/lydy@bup-4hr-20100224-120000CST into bup-ruin/fsfs/zp1/lydy@bup-4hr-20100224-120000CST
sending from @bup-4hr-20100228-080000CST to zp1/lydy@bup-daily-20100228-100000CST
cannot receive incremental stream: most recent snapshot of bup-ruin/fsfs/zp1/lydy does not match incremental source

This badly breaks my understanding of what zfs send/receive does. I don't see how this error is a possible outcome of the pair of commands I gave.

real    59m8.057s
user    0m0.537s
sys     5m51.895s

It ran about an hour. A full backup that succeeds takes about 6 hours (the backup pools are external USB drives, not so fast).

bash-3.2$ zpool list
NAME       SIZE   USED   AVAIL  CAP  HEALTH  ALTROOT
bup-ruin   928G   63.7G  864G    6%  ONLINE  /backups/bup-ruin
rpool      149G   6.41G  143G    4%  ONLINE  -
zp1        1.09T  637G   479G   57%  ONLINE  -

As you can see, it was nowhere near finished. But nothing was full, nothing crashed.

Anybody got a spare clue? It shouldn't, as I understand it, be possible for a full replication stream going into a newly-created filesystem to get this error.

--
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
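For readers who want to follow the sequence in the trace, here is a minimal sketch of the four steps as one Bash function. It is not the poster's actual script: the function name `full_backup` and the variables `SRC_POOL`, `DEST_FS`, and `DRY_RUN` are hypothetical, and only the `zfs` command lines themselves are taken from the trace. By default it only prints the commands, since the second step destroys the existing backup.

```shell
# Sketch only. Names below are assumptions, not part of the original script.
SRC_POOL="${SRC_POOL:-zp1}"
DEST_FS="${DEST_FS:-bup-ruin/fsfs/zp1}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = actually run them

# full_backup SNAPNAME: snapshot the source pool recursively, recreate the
# destination filesystem, then pipe a full replication stream into it.
full_backup() {
    local snap="$1"
    local steps=(
        "zfs snapshot -r ${SRC_POOL}@${snap}"
        "zfs destroy -rf ${DEST_FS}"
        "zfs create -p ${DEST_FS}"
        "zfs send -Rv ${SRC_POOL}@${snap} | zfs recv -Fudv ${DEST_FS}"
    )
    local step
    for step in "${steps[@]}"; do
        if [ "$DRY_RUN" = "1" ]; then
            printf '+ %s\n' "$step"
        else
            # pipefail makes a failing "zfs send" fail the whole pipeline.
            bash -o pipefail -c "$step" || return 1
        fi
    done
}
```

A call such as `full_backup "$(date -u +bup-%Y%m%d-%H%M%SGMT)"` would produce a snapshot name in the same style as the one in the trace. Stopping at the first failed step is a design choice of this sketch; the trace does not show how the original script handles errors.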
Richard Elling
2010-Mar-07 20:08 UTC
[zfs-discuss] ZFS replication send/receive errors out
On Mar 5, 2010, at 7:32 AM, David Dyer-Bennet wrote:

> sending from @bup-4hr-20100228-080000CST to
> zp1/lydy@bup-daily-20100228-100000CST
> cannot receive incremental stream: most recent snapshot of
> bup-ruin/fsfs/zp1/lydy does not match incremental source

This can happen if the auto-snapshot (aka Time Slider) service creates a snapshot of the receiving dataset.

 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
http://nexenta-atlanta.eventbrite.com (March 16-18, 2010)
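One way to test this explanation is to look for snapshots on the receiving side that have no counterpart on the sending side. The sketch below is an illustration under assumptions: the function names are hypothetical, and the prefix handling assumes the `zfs recv -d` layout from the trace, where `zp1/lydy@X` lands at `bup-ruin/fsfs/zp1/lydy@X`.

```shell
# Sketch only; function names are hypothetical.

# List every snapshot at or below a dataset, one full name per line.
list_snaps() {
    zfs list -H -r -t snapshot -o name "$1"
}

# unexpected_snaps SRC_FILE DEST_FILE PREFIX
# Print destination snapshots with no matching source snapshot.
# PREFIX is the part of the destination path to strip so that both
# lists use source-side names (assumed here: bup-ruin/fsfs).
unexpected_snaps() {
    local src_file="$1" dest_file="$2" prefix="$3"
    # comm -13 keeps lines that appear only in the second (destination) input.
    sed "s|^${prefix}/||" "$dest_file" | sort | comm -13 <(sort "$src_file") -
}

# On the live system (not run here):
#   list_snaps zp1 > /tmp/src.txt
#   list_snaps bup-ruin/fsfs/zp1 > /tmp/dest.txt
#   unexpected_snaps /tmp/src.txt /tmp/dest.txt bup-ruin/fsfs
```

Any name this prints is a snapshot that something other than the replication stream created on the backup pool. Snapshots that have not been received yet exist only on the source, so they are not reported.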
David Dyer-Bennet
2010-Mar-08 03:18 UTC
[zfs-discuss] ZFS replication send/receive errors out
On 3/7/2010 2:08 PM, Richard Elling wrote:
> On Mar 5, 2010, at 7:32 AM, David Dyer-Bennet wrote:
>
>> sending from @bup-4hr-20100228-040000CST to
>> zp1/lydy@bup-4hr-20100228-080000CST
>> received 312B stream in 1 seconds (312B/sec)
>> receiving incremental stream of zp1/lydy@bup-4hr-20100224-120000CST into
>> bup-ruin/fsfs/zp1/lydy@bup-4hr-20100224-120000CST
>> sending from @bup-4hr-20100228-080000CST to
>> zp1/lydy@bup-daily-20100228-100000CST
>> cannot receive incremental stream: most recent snapshot of
>> bup-ruin/fsfs/zp1/lydy does not match incremental source
>
> This can happen if the auto-snapshot (aka Time Slider) service creates a
> snapshot of the receiving dataset.

I don't think that's in 2009.06, is it? On by default? And I don't recall installing it as any kind of addon. Is there some obvious way to check?

Oh, one thing is, snapshots I'm not expecting aren't appearing. Since I'm working on this software, I look at lists of snapshots pretty often.

Of course it wouldn't be specific to auto-snapshot; presumably anything that created a snapshot of the receiving dataset? But I can't think of anything I have that would do that. I have my own service, but it's quite specific about what it does.

Well, it's something I can look for. Thanks for the suggestion!
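On the question of how to check, one possible approach is to ask SMF which snapshot-related services are online and to look at the `com.sun:auto-snapshot` property on the backup pool. The sketch below is hedged: the function name is hypothetical, and the pattern assumes the service FMRIs contain `auto-snapshot` or `time-slider`, which may not hold on every build.

```shell
# Sketch only; the function name and the FMRI pattern are assumptions.

# Read "STATE FMRI" pairs on stdin and print the FMRIs of online services
# whose name suggests automatic snapshots.
online_snapshot_services() {
    awk '$1 == "online" && $2 ~ /auto-snapshot|time-slider/ { print $2 }'
}

# On the live system (not run here):
#   svcs -H -a -o state,fmri | online_snapshot_services
#   zfs get -r -H -o name,value com.sun:auto-snapshot bup-ruin
```

Empty output from the first command would suggest that no such service is running. It would not rule out some other tool creating snapshots on the receiving dataset, which is the more general case raised in the message above.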