John Plocher
2007-May-25 22:28 UTC
[zfs-discuss] I seem to have backed myself into a corner - how do I migrate filesystems from one pool to another?
Through a sequence of good intentions, I find myself with a raidz'd pool
that has a failed drive that I can't replace.

We had a generous department donate a fully configured V440 for use as
our departmental server. Of course, I installed SX/b56 on it, created a
pool with 3x 148Gb drives and made a dozen filesystems on it. Life was
good. ZFS is great!

Then one of the raidz pool drives failed. When I went to replace it, I
found that the V440's original 72Gb drives had been "upgraded" to Dell
148Gb Fujitsu drives, and the Sun versions of those drives (same model
number...) had different firmware and, more importantly, FEWER sectors!
They were only 147.8 Gb! You know what they say about a free lunch and
too good to be true... This meant that zpool replace <drive> failed
because the replacement drive is too small.

The question of the moment is "what to do?". All I can think of is to:

  1) Attach/create a new pool that has enough space to hold the
     existing content,
  2) Copy the content from the old pool to the new one,
  3) Destroy the old pool,
  4) Recreate the old pool with the (slightly) smaller size, and
  5) Copy the data back onto it.

Given that there are a bunch of filesystems in the pool, each with some
set of properties ..., what is the easiest way to move the data and
metadata back and forth without losing anything, and without having to
manually recreate the metainfo/properties?

(Adding to the 'shrink' RFE: if I replace a pool drive with a smaller
one, and the existing content is small enough to fit on a shrunk/resized
pool, the zpool replace command should (after prompting) simply do the
work. In this situation, losing less than 10Mb of pool space to get a
healthy raidz configuration seems to be an easy tradeoff :-)

TIA,
  -John
Matthew Ahrens
2007-May-25 22:45 UTC
[zfs-discuss] I seem to have backed myself into a corner - how do I migrate filesystems from one pool to another?
> Given that there are a bunch of filesystems in the pool, each
> with some set of properties ..., what is the easiest way to
> move the data and metadata back and forth without losing
> anything, and without having to manually recreate the
> metainfo/properties?

AFAIK, your only choices are:

A. Write/find a script to do the appropriate 'zfs send|recv' and
   'zfs set' commands.

B. Wait for us to implement 6421959 & 6421958 (zfs send -r / -p).
   I'm currently working on this, ETA at least a few months.

Sorry,
--matt
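For what it's worth, a rough, untested sketch of what option A can look
like; the pool names, the @move snapshot, and the property settings are
only illustrative (borrowed from the layout described above):

    #!/bin/ksh
    # Sketch of option A: per-filesystem send/recv plus manual property
    # fix-up, since zfs send -r / -p don't exist yet.
    SRC=tank DST=tank2

    # baseline snapshot of everything in the source pool
    zfs snapshot -r $SRC@move

    # full send of each filesystem's baseline into the new pool
    for fs in $(zfs list -H -o name -t filesystem -r $SRC); do
        [ "$fs" = "$SRC" ] && continue
        zfs send "$fs@move" | zfs recv "$DST/${fs#$SRC/}"
    done

    # properties don't travel with the stream, so reapply the ones you
    # care about by hand, e.g.:
    zfs set mountpoint=/export2 $DST/projects
    zfs set sharenfs=on $DST/projects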
Will Murnane
2007-May-25 23:16 UTC
[zfs-discuss] I seem to have backed myself into a corner - how do I migrate filesystems from one pool to another?
On 5/25/07, John Plocher <John.Plocher at sun.com> wrote:
> One of the raidz pool drives failed. When I went to replace it,
> I found that the V440's original 72Gb drives had been "upgraded"
> to Dell 148Gb Fujitsu drives, and the Sun versions of those drives
> (same model number...) had different firmware, and more importantly,
> FEWER sectors! They were only 147.8 Gb! You know what they say
> about a free lunch and too good to be true...

What about buying a single larger drive? A 300 GB disk had better have
at least 148 GB on it... It's a few hundred bucks extra, granted, but if
you have to rent or buy enough space to back everything up, it might be
a tossup.

Will
Toby Thain
2007-May-26 03:41 UTC
[zfs-discuss] I seem to have backed myself into a corner - how do I migrate filesystems from one pool to another?
On 25-May-07, at 7:28 PM, John Plocher wrote:
> ...
> I found that the V440's original 72Gb drives had been "upgraded"
> to Dell 148Gb Fujitsu drives, and the Sun versions of those drives
> (same model number...) had different firmware

You can't get hold of another one of the same drive?

--Toby
Peter Eriksson
2007-May-28 16:59 UTC
[zfs-discuss] Re: I seem to have backed myself into a corner - how do I migrate filesyst
Sun's disks are labeled with a standard label that is smaller than the
actual disk (so that they can be interchangeable in the future). I'd
first try to wipe the Sun label from the disk and have format write a
new label on it, i.e.:

    dd if=/dev/zero of=/dev/rdsk/YOURDISK bs=512 count=1024
    format
        <select disk>
        y    (write a label)

Might work... (and might not help if Sun has modified the firmware to
report fewer tracks/heads/sectors than the retail version has).
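If you try that, one way to tell whether the relabel actually buys back
the missing sectors (a sketch only; c1t3d0 is just the failed slot
mentioned earlier in the thread, so adjust to suit):

    # record the sector count the factory Sun label advertises
    prtvtoc /dev/rdsk/c1t3d0s2 > /tmp/vtoc.before

    # wipe the label and write a fresh one, as above
    dd if=/dev/zero of=/dev/rdsk/c1t3d0s2 bs=512 count=1024
    format -d c1t3d0          # then "label" and confirm at the prompts

    # if the accessible sector count hasn't grown, the firmware itself
    # reports the smaller size and relabeling won't help
    prtvtoc /dev/rdsk/c1t3d0s2 > /tmp/vtoc.after
    diff /tmp/vtoc.before /tmp/vtoc.after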
John Plocher
2007-Jun-01 20:45 UTC
Success: Re: [zfs-discuss] Re: I seem to have backed myself into a corner - how do I migrate filesyst
I managed to correct the problem by writing a script, inspired by Chris
Gerhard's blog, that did a zfs send | zfs recv. Now that things are back
up, I have a couple of lingering questions:

1) I noticed that the filesystem size information is not the same
   between the src and dst filesystem sets. Is this expected behavior?

root@sac> zfs list -r tank/projects/sac
NAME                        USED  AVAIL  REFER  MOUNTPOINT
tank/projects/sac          49.0G   218G  48.7G  /export/sac
tank/projects/sac@hour_16   104M      -  48.7G  -
tank/projects/sac@hour_17  96.7M      -  48.7G  -
tank/projects/sac@hour_18  74.3M      -  48.7G  -
tank/projects/sac@hour_19  18.7M      -  48.7G  -

root@sac> zfs list -r tank2/projects/sac
NAME                         USED  AVAIL  REFER  MOUNTPOINT
tank2/projects/sac          49.3G   110G  48.6G  /export2/sac
tank2/projects/sac@hour_16  99.7M      -  48.6G  -
tank2/projects/sac@hour_17  92.3M      -  48.6G  -
tank2/projects/sac@hour_18  70.1M      -  48.6G  -
tank2/projects/sac@hour_19  70.7M      -  48.6G  -

2) Following Chris's advice to do more with snapshots, I played with his
   cron-triggered snapshot routine:
   http://blogs.sun.com/chrisg/entry/snapping_every_minute

   Now, after a couple of days, zpool history shows almost 100,000 lines
   of output (from all the snapshots and deletions...)

   How can I purge or truncate this log (which has got to be taking up
   several Mb of space, not to mention the ever increasing sluggishness
   of the command...)?

  -John

Oh, here's the script I used - it contains hardcoded zpool and zfs info,
so it must be edited to match your specifics before it is used! It can
be rerun safely; it only sends snapshots that haven't already been sent,
so that I could do the initial time-intensive copies while the system
was still in use and only have to do a faster "resync" while down in
single user mode. It isn't pretty (it /is/ a perl script) but it
worked :-)

--------------------------
#!/usr/bin/perl
# John Plocher - May, 2007
# ZFS helper script to replicate the filesystems+snapshots in
# SRCPOOL onto a new DSTPOOL that was a different size.
#
# Historical situation:
#   + zpool create tank raidz c1t1d0 c1t2d0 c1t3d0
#   + zfs create tank/projects
#   + zfs set mountpoint=/export tank/projects
#   + zfs set sharenfs=on tank/projects
#   + zfs create tank/projects/...
#   ... fill up the above with data...
#   Drive c1t3d0 FAILED
#   + zpool offline tank c1t3d0
#   ... find out that replacement drive is 10,000 sectors SMALLER
#   ... than the original, and zpool replace won't work with it.
#
# Usage Model:
#   Create a new (temp) pool large enough to hold all the data
#   currently on tank
#   + zpool create tank2 c2t2d0 c2t3d0 c2t4d0
#   + zfs set mountpoint=/export2 tank2/projects
#   Set a baseline snapshot on tank
#   + zfs snapshot -r tank@bulk
#   Edit and run this script to copy the data + filesystems from tank to
#   the new pool tank2
#   + ./copyfs
#   Drop to single user mode, unshare the tank filesystems,
#   + init s
#   + zfs unshare tank
#   Shut down apache, cron and sendmail
#   + svcadm disable svc:/network/http:cswapache2
#   + svcadm disable svc:/system/cron:default
#   + svcadm disable svc:/network/smtp:sendmail
#   Take another snapshot,
#   + zfs snapshot -r tank@now
#   Rerun script to catch recent changes
#   + ./copyfs
#   Verify that the copies were successful,
#   + dircmp -s /export/projects /export2/projects
#   + zpool destroy tank
#   + zpool create tank raidz c1t1d0 c1t2d0 c1t3d0
#   Modify script to reverse transfer and set properties, then
#   run script to recreate tank's filesystems,
#   + ./copyfs
#   Reverify that content is still correct
#   + dircmp -s /export/projects /export2/projects
#   Re-enable cron, http and mail
#   + svcadm enable svc:/network/http:cswapache2
#   + svcadm enable svc:/system/cron:default
#   + svcadm enable svc:/network/smtp:sendmail
#   Go back to multiuser
#   + init 3
#   Reshare filesystems.
#   + zfs share tank
#   Go home and get some sleep
#

$SRCPOOL = "tank";
$DSTPOOL = "tank2";

# Set various properties once the initial filesystem is recv'd...
# (Uncomment these when copying the filesystems back to their original pool)
# $props{"projects"} = ();
# push( @{ $props{"projects"} }, ("zfs set mountpoint=/export tank/projects"));
# push( @{ $props{"projects"} }, ("zfs set sharenfs=on tank/projects"));
# $props{"projects/viper"} = ();
# push( @{ $props{"projects/viper"} }, ("zfs set sharenfs=rw=bk-test:eressea:scuba:sac:viper:caboose,root=sac:viper:caboose,ro tank/projects/viper"));

sub getsnapshots(@) {
    my (@filesystems) = @_;
    my @snaps;
    my @snapshots;

    foreach my $fs ( @filesystems ) {
        chomp($fs);
        next if ($fs eq $SRCPOOL);
        # print "Filesystem: $fs\n";
        # Get a list of all snapshots in this filesystem
        @snapshots = split /^/, `zfs list -Hr -t snapshot -o name -s creation $fs`;
        foreach my $dataset ( @snapshots ) {
            chomp($dataset);
            my ($dpool, $dsnapshot) = split(/\//, $dataset, 2);
            my ($dfs, $dtag) = split(/@/, $dsnapshot, 2);
            next if ($fs ne "$dpool/$dfs");
            next if ($dtag =~ /^minute_/);
            # print "  Dataset=$dataset, P=$dpool, S=$dsnapshot, FS=$dfs, TAG=$dtag\n";
            push (@snaps, ($dataset));
        }
    }
    return @snaps;
}

# Get a list of all filesystems in the SRC pool
@src_snaps = &getsnapshots(split /^/, `zfs list -Hr -t filesystem -o name $SRCPOOL`);
# Get a list of all filesystems in the DST pool
@dst_snaps = &getsnapshots(split /^/, `zfs list -Hr -t filesystem -o name $DSTPOOL`);

# Mark snapshots that have already been sent...
foreach my $dataset ( @dst_snaps ) {
    ($pool, $snapshot) = split(/\//, $dataset, 2);
    ($fs, $tag) = split(/@/, $snapshot, 2);
    $last{$fs} = $snapshot;   # keep track of the last one sent
    $dst{$fs}{$tag} = $pool;
}

# only send snaps that have not already been sent
foreach $dataset ( @src_snaps ) {
    ($pool, $snapshot) = split(/\//, $dataset, 2);
    ($fs, $tag) = split(/@/, $snapshot, 2);
    if (!defined($dst{$fs}{$tag})) {
        push (@snaps, ($dataset));
    }
}

# do the work...
if ($#snaps == -1) {
    print("Warning: No uncopied snapshots found in pool $SRCPOOL\n");
} else {
    # copy them over to the new pool
    $last_fs = "";
    foreach $dataset ( @snaps ) {
        ($pool, $snapshot) = split(/\//, $dataset, 2);
        ($fs, $tag) = split(/@/, $snapshot, 2);
        if ($fs ne $last_fs) {
            $last_snapshot = undef;
            print "Filesystem: $fs\n";
            $last_fs = $fs;
        }
        # print "accepted: P=$pool, FS=$fs, TAG=$tag\n";
        @cmd = ();
        if ( !defined($last_snapshot) ) {
            if ( defined($last{$fs}) ) {
                push(@cmd, ("zfs send -i $SRCPOOL/$last{$fs} $dataset | zfs recv $DSTPOOL/$fs"));
            } else {
                push(@cmd, ("zfs send $dataset | zfs recv $DSTPOOL/$fs"));
            }
            # If any properties need to be set on this filesystem, do so
            # after the initial dataset has been copied over...
            if ( defined($props{$fs}) ) {
                foreach my $c ( @{ $props{$fs} }) {
                    push(@cmd, ($c));
                }
            }
        } else {
            push(@cmd, ("zfs send -i $last_snapshot $dataset | zfs recv $DSTPOOL/$fs"));
        }
        foreach $cmd ( @cmd ) {
            print "  + $cmd\n";
            system($cmd);
        }
        $last_snapshot = $dataset;
    }
}
----------------------------
eric kustarz
2007-Jun-01 20:54 UTC
Success: Re: [zfs-discuss] Re: I seem to have backed myself into a corner - how do I migrate filesyst
> 2) Following Chris's advice to do more with snapshots, I
>    played with his cron-triggered snapshot routine:
>    http://blogs.sun.com/chrisg/entry/snapping_every_minute
>
>    Now, after a couple of days, zpool history shows almost
>    100,000 lines of output (from all the snapshots and
>    deletions...)
>
>    How can I purge or truncate this log (which has got to be
>    taking up several Mb of space, not to mention the ever
>    increasing sluggishness of the command...)
>

You can check out the comment at the head of spa_history.c:
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/spa_history.c

The history is implemented as a ring buffer (where the size is
MIN(32MB, 1% of your capacity)):
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/spa_history.c#105

We specifically didn't allow the admin the ability to truncate/prune
the log as then it becomes unreliable - ooops i made a mistake, i
better clear the log and file the bug against zfs ....

eric
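To put that bound in rough, concrete terms (a sketch; the pool name and
the 3x 148Gb sizing come from earlier in the thread, and the numbers are
only approximate):

    # the on-disk history buffer is capped at MIN(32MB, 1% of capacity)
    zpool list -H -o size tank    # e.g. ~444G raw for a 3 x 148Gb raidz
    # 1% of ~444G is a few GB, well above 32MB, so the cap here is 32MB;
    # once it fills, the oldest entries are overwritten rather than the
    # log growing without bound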
John Plocher
2007-Jun-01 21:09 UTC
Success: Re: [zfs-discuss] Re: I seem to have backed myself into a corner - how do I migrate filesyst
eric kustarz wrote:
> We specifically didn't allow the admin the ability to truncate/prune the
> log as then it becomes unreliable - ooops i made a mistake, i better
> clear the log and file the bug against zfs ....

I understand - auditing means never getting to blame someone else :-)

There are things in the log that are (IMHO, and In My Particular Case)
more important than others. Snapshot creations & deletions are "noise"
compared with filesystem creations, property settings, etc.

This seems especially true when there is closure on actions - the set of

    zfs snapshot foo/bar@now
    zfs destroy foo/bar@now

commands is (except for debugging zfs itself) a noop....

Looking at spa_history.c, it doesn't look like there is an easy way to
mark a set of messages as "unwanted" and compress the log without having
to take the pool out of service first. Oh well...

  -John
Nicolas Williams
2007-Jun-01 21:15 UTC
Success: Re: [zfs-discuss] Re: I seem to have backed myself into a corner - how do I migrate filesyst
On Fri, Jun 01, 2007 at 02:09:55PM -0700, John Plocher wrote:
> eric kustarz wrote:
> > We specifically didn't allow the admin the ability to truncate/prune the
> > log as then it becomes unreliable - ooops i made a mistake, i better
> > clear the log and file the bug against zfs ....
>
> I understand - auditing means never getting to blame someone else :-)
>
> There are things in the log that are (IMHO, and In My Particular Case)
> more important than others. Snapshot creations & deletions are "noise"
> compared with filesystem creations, property settings, etc.

But clone creation == filesystem creation, and since you can only clone
snapshots you'd want snapshotting included in the log, at least the ones
referenced by live clones. Or if there was a pivot and the old fs and
snapshot were destroyed you might still want to know about that.

I think there has to be a way to truncate/filter the log, at least by date.

> This seems especially true when there is closure on actions - the set of
>     zfs snapshot foo/bar@now
>     zfs destroy foo/bar@now
> commands is (except for debugging zfs itself) a noop....

Yes, but it could be very complicated:

    zfs snapshot foo/bar@now
    zfs clone foo/bar@now foo/bar-then
    zfs clone foo/bar@now foo/bar-then-again
    zfs snapshot foo/bar-then@now
    zfs clone foo/bar-then@now foo/bar-then-and-then
    zfs destroy -r foo/bar@now
eric kustarz
2007-Jun-01 21:16 UTC
Success: Re: [zfs-discuss] Re: I seem to have backed myself into a corner - how do I migrate filesyst
On Jun 1, 2007, at 2:09 PM, John Plocher wrote:

> eric kustarz wrote:
>> We specifically didn't allow the admin the ability to truncate/
>> prune the log as then it becomes unreliable - ooops i made a
>> mistake, i better clear the log and file the bug against zfs ....
>
> I understand - auditing means never getting to blame someone else :-)

:)

> There are things in the log that are (IMHO, and In My Particular Case)
> more important than others. Snapshot creations & deletions are "noise"
> compared with filesystem creations, property settings, etc.
>
> This seems especially true when there is closure on actions - the set of
>     zfs snapshot foo/bar@now
>     zfs destroy foo/bar@now
> commands is (except for debugging zfs itself) a noop....
>
> Looking at spa_history.c, it doesn't look like there is an easy way
> to mark a set of messages as "unwanted" and compress the log without
> having to take the pool out of service first.

Right, you'll have to do any post-processing yourself (something like a
script + cron job).

eric
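For example, something as simple as the following gives a readable view
without touching the log itself (the pool name and the minute_ snapshot
naming come from earlier in the thread; the filter pattern is only
illustrative):

    # show the pool history minus the per-minute snapshot churn;
    # the on-disk log is never modified, this only filters the output
    zpool history tank | egrep -v 'zfs (snapshot|destroy) .*@minute_'

    # or just count how much of the log is that churn
    zpool history tank | egrep -c 'zfs (snapshot|destroy) .*@minute_'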
Mark J Musante
2007-Jun-01 21:22 UTC
Success: Re: [zfs-discuss] Re: I seem to have backed myself into a corner - how do I migrate filesyst
On Fri, 1 Jun 2007, John Plocher wrote:

> This seems especially true when there is closure on actions - the set of
>     zfs snapshot foo/bar@now
>     zfs destroy foo/bar@now
> commands is (except for debugging zfs itself) a noop....

Note that if you use the recursive snapshot and destroy, only one line
is entered into the history for all filesystems.

Regards,
markm
John Plocher
2007-Jun-01 21:46 UTC
Success: Re: [zfs-discuss] Re: I seem to have backed myself into a corner - how do I migrate filesyst
Mark J Musante wrote:
> Note that if you use the recursive snapshot and destroy, only one line
> is entered into the history for all filesystems.

My "problem" (and it really is /not/ an important one) was that I had a
cron job that every minute did

    min=`date "+%M"`
    snap="$pool/$filesystem@minute_$min"
    zfs destroy "$snap"
    zfs snapshot "$snap"

and, after a couple of days (at 1,440 minutes a day), the pool's history
log seemed quite full (but not at capacity...)

There were no clones to complicate things...

  -John
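Picking up Mark's suggestion, a sketch of the same per-minute job
reworked to use the recursive flags, so each pass is logged as one
snapshot and one destroy for the whole pool rather than a pair per
filesystem (the pool name and snapshot prefix are only illustrative):

    #!/bin/ksh
    # rotating per-minute snapshot of the whole pool, using -r so the
    # entire tree is snapshotted (and logged) as a single operation
    pool=tank
    min=`date "+%M"`

    # drop last hour's snapshot for this minute slot, then take a new
    # one; the destroy fails harmlessly the first time through each slot
    zfs destroy -r "$pool@minute_$min" 2>/dev/null
    zfs snapshot -r "$pool@minute_$min"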