We would like to delete and recreate our existing ZFS pool without losing any data. The way we thought we could do this was to attach a few HDDs and create a new temporary pool, migrate our existing ZFS volumes to the new pool, delete and recreate the old pool, and then migrate the volumes back. The big problem is that we need to do all of this live, without any downtime. We have 2 volumes taking up around 11TB, and they are shared out to a couple of Windows servers with COMSTAR. Anyone have any good ideas?
Hi Wolf,

Which Solaris release is this? If it is an OpenSolaris system running a recent build, you might consider the zpool split feature, which splits a mirrored pool into two separate pools while the original pool is online. If possible, attach the spare disks to create the mirrored pool as a first step. See the example below.

Thanks,

Cindy

You can attach the spare disks to the existing pool to create the mirrored pool:

# zpool attach tank disk-1 spare-disk-1
# zpool attach tank disk-2 spare-disk-2

Which gives you a pool like this:

# zpool status tank
  pool: tank
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Tue Apr 27 14:36:28 2010
config:

        NAME         STATE     READ WRITE CKSUM
        tank         ONLINE       0     0     0
          mirror-0   ONLINE       0     0     0
            c2t9d0   ONLINE       0     0     0
            c2t5d0   ONLINE       0     0     0
          mirror-1   ONLINE       0     0     0
            c2t10d0  ONLINE       0     0     0
            c2t6d0   ONLINE       0     0     0  56.5K resilvered

errors: No known data errors

Then, split the mirrored pool, like this:

# zpool split tank tank2
# zpool import tank2
# zpool status tank tank2
  pool: tank
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Tue Apr 27 14:36:28 2010
config:

        NAME       STATE     READ WRITE CKSUM
        tank       ONLINE       0     0     0
          c2t9d0   ONLINE       0     0     0
          c2t10d0  ONLINE       0     0     0

errors: No known data errors

  pool: tank2
 state: ONLINE
 scrub: none requested
config:

        NAME       STATE     READ WRITE CKSUM
        tank2      ONLINE       0     0     0
          c2t5d0   ONLINE       0     0     0
          c2t6d0   ONLINE       0     0     0
It is unclear what you want to do. What's the goal for this exercise? If you want to replace the pool with larger disks and the pool is a mirror or raidz, you just replace one disk at a time and allow the pool to rebuild itself. Once all the disks have been replaced, it will automatically recognize the size increase and expand the pool.
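A minimal sketch of that disk-at-a-time replacement, assuming a pool named tank and hypothetical device names (c1t1d0 being swapped for a larger c2t1d0, and so on):

# zpool set autoexpand=on tank
# zpool replace tank c1t1d0 c2t1d0
# zpool status tank      (wait for the resilver to complete before touching the next disk)
# zpool replace tank c1t2d0 c2t2d0
(repeat for the remaining disks in the vdev)
# zpool list tank        (the extra capacity appears once the last disk has been replaced)

The autoexpand property is available on recent builds; on older releases the extra space typically shows up only after an export/import of the pool.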
The original drive pool was configured with 144 1TB drives and a hardware RAID-0 stripe across every 4 drives to create 4TB LUNs. These LUNs were then combined into 6 raidz2 vdevs and added to the ZFS pool. I would like to delete the original hardware RAID-0 stripes and add the 144 drives directly to the ZFS pool. This should improve performance considerably, since we would no longer be doing RAID on top of RAID, and it would fix the whole stripe-size issue. Since this pool will be deleted and recreated, I will need to move the data off to something else.
We are running the latest dev release.

I was hoping to just mirror the ZFS volumes and not the whole pool. The original pool is around 100TB in size. The spare disks I have come up with will total around 40TB. We only have 11TB of space in use on the original ZFS pool.
On Apr 28, 2010, at 6:40 AM, Wolfraider wrote:
> We are running the latest dev release.
>
> I was hoping to just mirror the ZFS volumes and not the whole pool. The original pool is around 100TB in size. The spare disks I have come up with will total around 40TB. We only have 11TB of space in use on the original ZFS pool.

Mirrors are made with vdevs (LUs or disks), not pools. However, the vdev attached to a mirror must be the same size (or nearly so) as the original. If the original vdevs are 4TB, then a migration to a pool made with 1TB vdevs cannot be done by replacing vdevs (the mirror method).
-- richard
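As a quick illustration of that size constraint (hypothetical device names, and the exact wording of the error may differ between builds), attaching a smaller device to an existing vdev is simply refused:

# zpool attach tank 4tb-lun-0 1tb-disk-0
cannot attach 1tb-disk-0 to 4tb-lun-0: device is too small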
On Apr 28, 2010, at 6:37 AM, Wolfraider wrote:
> The original drive pool was configured with 144 1TB drives and a hardware RAID-0 stripe across every 4 drives to create 4TB LUNs.

For the archives, this is not a good idea...

> These LUNs were then combined into 6 raidz2 vdevs and added to the ZFS pool.

... even when data protection is achieved at a higher level. See also RAID-0+1 vs. RAID-1+0.
-- richard
> On Apr 28, 2010, at 6:37 AM, Wolfraider wrote:
>> The original drive pool was configured with 144 1TB drives and a hardware RAID-0 stripe across every 4 drives to create 4TB LUNs.
>
> For the archives, this is not a good idea...

Exactly. This is the reason I want to blow all of the old configuration away. :)
> Mirrors are made with vdevs (LUs or disks), not pools. However, the vdev attached to a mirror must be the same size (or nearly so) as the original. If the original vdevs are 4TB, then a migration to a pool made with 1TB vdevs cannot be done by replacing vdevs (the mirror method).
> -- richard

Both LUNs that we are sharing out with COMSTAR are ZFS volumes. It sounds like we can create the new temporary pool, create a couple of new LUs the same size as the old ones, and then create mirrors between the two. Wait until they are synced and break the mirror. This is what we were thinking we could do; we just wanted to make sure.
On Apr 28, 2010, at 8:39 AM, Wolfraider wrote:
>> Mirrors are made with vdevs (LUs or disks), not pools. However, the vdev attached to a mirror must be the same size (or nearly so) as the original. If the original vdevs are 4TB, then a migration to a pool made with 1TB vdevs cannot be done by replacing vdevs (the mirror method).
>> -- richard
>
> Both LUNs that we are sharing out with COMSTAR are ZFS volumes. It sounds like we can create the new temporary pool, create a couple of new LUs the same size as the old ones, and then create mirrors between the two. Wait until they are synced and break the mirror. This is what we were thinking we could do; we just wanted to make sure.

This can work, and you can make the temporary iSCSI targets compressed, sparse volumes. If your 100TB of data will squeeze into 40TB, then it is just a matter of time to copy.
-- richard
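A rough sketch of creating such a temporary target, assuming a temporary pool named tank2 and a hypothetical 8TB LU: the -s flag makes the volume sparse so only the blocks actually written consume space, and compression helps the data squeeze into the smaller pool. The COMSTAR steps are shown with a placeholder GUID:

# zfs create -s -V 8T -o compression=on tank2/templun0
# sbdadm create-lu /dev/zvol/rdsk/tank2/templun0
# stmfadm add-view <GUID-printed-by-sbdadm>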
For this type of migration, downtime is required. However, it can be reduced to a few hours or even a few minutes, depending on how much change needs to be synced. I have done this many times on a NetApp filer, but it can be applied to ZFS as well. The first thing to consider is doing the migration only once, so you don't need two downtimes. Let's talk about the migration first.

1. You will need a recent enough zfs to support zfs send and receive.
2. Create your destination pool (there are things you can do here to avoid migrating back).
3. Create your destination volume.
4. Create snapshot snap1 of the source volume (zfs snapshot).
5. Use zfs send <volume>@snap1 | zfs receive <dstvol> (this will sync most of the 11TB and may take days).
6. Create snapshot snap2 of the source volume.
7. Incrementally sync the snapshots with zfs send -i <volume>@snap1 <volume>@snap2 | zfs receive <dstvol> (this should be faster). Repeat steps 6 and 7 as needed to get the sync time down to about the allowed downtime.
8. ** DOWNTIME ** Turn off the Windows servers.
9. zfs unmount the source volume to ensure no more changes to the volume.
10. Create snapshot final of the source volume.
11. Incrementally sync the final snapshot.
12. Rename the source volume to a backup name (you can rename pools via export/import).
13. Rename the dstvol to production.
14. Mount the production dstvol (reconfigure what you need for COMSTAR).
15. Turn on the Windows servers.
16. You need some way of verifying the migration, and of backing out if needed. Once verified, enable the Windows services. ** END OF DOWNTIME **
17. You should have a backup of the old volume before destroying the old pool.
18. Destroy the old pool.
19. Add the now-spare disks to the new pool.

Doing this with no downtime is not possible, because you need to switch pools and ZFS doesn't currently support features like pvmove, vgsplit, vgmerge, and vgreduce in LVM.
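To make steps 4 through 7 concrete, here is a minimal sketch using assumed names (source pool tank, volume lun0, temporary pool tank2); the -F on the incremental receive simply rolls the destination back to the last snapshot in case anything touched it between syncs:

# zfs snapshot tank/lun0@snap1
# zfs send tank/lun0@snap1 | zfs receive tank2/lun0
# zfs snapshot tank/lun0@snap2
# zfs send -i tank/lun0@snap1 tank/lun0@snap2 | zfs receive -F tank2/lun0

Note that zvols shared over COMSTAR are never mounted, so the equivalent of step 9 is simply making sure the Windows side has stopped writing before the final snapshot is taken.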
So, on the point of not needing a migration back: even at 144 disks, they won't all be in the same raid group. So figure out the best raid-group size for you, since ZFS doesn't support changing the number of disks in a raidz yet. I usually use the number of slots per shelf, or a good number is 7-10 disks for raidz1 and 10-20 for raidz2 or raidz3. Create the dstvol with that optimized number of disks per group (or two groups) and do the migration. Then, once the old volume is destroyed, just add its disks to the new pool. Remember you can expand a pool one raid group at a time; you just can't change the raid-group size after that.
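A sketch of that one-raid-group-at-a-time growth, with hypothetical device names and a 10-disk raidz2 group size:

# zpool create newpool raidz2 c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0 c3t5d0 c3t6d0 c3t7d0 c3t8d0 c3t9d0
(migrate the data, destroy the old pool, then reuse its disks)
# zpool add newpool raidz2 c4t0d0 c4t1d0 c4t2d0 c4t3d0 c4t4d0 c4t5d0 c4t6d0 c4t7d0 c4t8d0 c4t9d0

Each zpool add appends another raidz2 group of the same width; what cannot be changed later is the number of disks inside an existing group.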
On Wed, 28 Apr 2010, Jim Horng wrote:
> So, on the point of not needing a migration back: even at 144 disks, they won't all be in the same raid group. So figure out the best raid-group size for you, since ZFS doesn't support changing the number of disks in a raidz yet. I usually use the number of slots per shelf, or a good number is 7-10 disks for raidz1 and 10-20 for raidz2 or raidz3.

Good luck with the 20 disks in raidz2 or raidz3. If you are going to base the number of disks per vdev on the shelf/rack layout, then it makes more sense to base it on the number of disk shelves/controllers than on the number of slots per shelf. You would want to distribute the disks in your raidz2 vdevs across the shelves rather than devote a shelf to one raidz2 vdev. At least, that is what you would do if you care about performance and reliability. If a shelf dies, your pool should remain alive. Likewise, there is likely more I/O bandwidth available if the vdevs are spread across controllers.

Bob
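A hedged illustration of that layout, assuming the shelves show up on controllers c3, c4, and c5: each raidz2 vdev takes two disks from every shelf, so a whole-shelf failure costs any one vdev only two disks and the pool stays up, degraded rather than dead:

# zpool create tank \
    raidz2 c3t0d0 c3t1d0 c4t0d0 c4t1d0 c5t0d0 c5t1d0 \
    raidz2 c3t2d0 c3t3d0 c4t2d0 c4t3d0 c5t2d0 c5t3d0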
Sorry, I need to correct myself. Mirroring the LUNs on the Windows side in order to switch the storage pool underneath them is a great idea, and I think you can do that without downtime.
3 shelves with 2 controllers each, 48 drives per shelf. These are Fibre Channel attached. We would like all 144 drives added to the same large pool.
I understand your point. However, in most production systems the shelves are added incrementally, so it makes sense to relate the group size to the number of slots per shelf. And in most cases, withstanding a shelf failure is too much overhead on the storage anyway. For example, in his case he would have to configure RAID 1+0 with only two shelves, i.e. 50% overhead.
On Wed, 28 Apr 2010, Jim Horng wrote:
> I understand your point. However, in most production systems the shelves are added incrementally, so it makes sense to relate the group size to the number of slots per shelf. And in most cases, withstanding a shelf failure is too much overhead on the storage anyway. For example, in his case he would have to configure RAID 1+0 with only two shelves, i.e. 50% overhead.

Yes, I can see that with 48 drives per shelf, the opportunities for creative natural fault isolation are less available. It is also true that hardware is often added incrementally. A strong argument can be made that smaller, less capable, simplex-routed shelves may be a more cost-effective and reliable solution when used carefully with zfs. For example, mini-shelves which support 8 drives each.

Bob
> 3 shelves with 2 controllers each, 48 drives per shelf. These are Fibre Channel attached. We would like all 144 drives added to the same large pool.

I would do either a 12- or 16-disk raidz3 vdev and spread the disks across controllers within each vdev. You may also want to leave at least 1 spare disk per shelf (i.e. some vdevs with one less disk). Just my 2 cents.
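A hedged sketch of one such vdev, with hypothetical controller and target names: a 12-disk raidz3 group drawing two disks from each of the six controllers, plus one hot spare per shelf. Further raidz3 groups would be added the same way until all 144 drives are placed:

# zpool create tank \
    raidz3 c3t0d0 c3t1d0 c4t0d0 c4t1d0 c5t0d0 c5t1d0 \
           c6t0d0 c6t1d0 c7t0d0 c7t1d0 c8t0d0 c8t1d0 \
    spare c3t47d0 c5t47d0 c7t47d0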
On Apr 28, 2010, at 9:48 PM, Jim Horng wrote:
>> 3 shelves with 2 controllers each, 48 drives per shelf. These are Fibre Channel attached. We would like all 144 drives added to the same large pool.
>
> I would do either a 12- or 16-disk raidz3 vdev and spread the disks across controllers within each vdev. You may also want to leave at least 1 spare disk per shelf (i.e. some vdevs with one less disk).

Why would you recommend a spare for raidz2 or raidz3?
-- richard
> Why would you recommend a spare for raidz2 or raidz3?
> -- richard

A spare is there to minimize the reconstruction time. Remember that a vdev cannot start resilvering until a replacement disk is available, and with disks as big as they are today, resilvering takes many hours. I would rather have the disk finish resilvering before I even have a chance to replace the bad one than risk more disks failing before it has a chance to resilver. This is especially important if the file system is not at a location with 24-hour staff.
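A short sketch of what that looks like in practice, with hypothetical device names; once spares are in the pool, the ZFS fault-management agent can pull one in as soon as a disk faults, without waiting for anyone to be on site:

# zpool add tank spare c3t47d0 c5t47d0 c7t47d0
# zpool status tank
(the spares appear in their own section as AVAIL, and switch to INUSE while covering a faulted disk)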
On Wed, 28 Apr 2010, Jim Horng wrote:
>> Why would you recommend a spare for raidz2 or raidz3?
>
> A spare is there to minimize the reconstruction time. Remember that a vdev cannot start resilvering until a replacement disk is available, and with disks as big as they are today, resilvering takes many hours. I would rather have the disk finish resilvering before I even have a chance to replace the bad one than risk more disks failing before it has a chance to resilver.

Would your opinion change if the disks you used took 7 days to resilver?

Bob
> Would your opinion change if the disks you used took
> 7 days to resilver?
>
> Bob

That would only make a stronger case that a hot spare is absolutely needed. It would also make a strong case for choosing raidz3 over raidz2, as well as for vdevs with a smaller number of disks.