Tomas Ögren
2009-Oct-19 18:41 UTC
[zfs-discuss] Interesting bug with picking labels when expanding a slice where a pool lives
Hi. We've got some test machines which, amongst other things, have zpools of various sizes and placements scribbled all over the disks.

0. HP DL380G3, Solaris 10u8, 2x 16G disks; c1t0d0 & c1t1d0
1. Took a (non-emptied) disk, created a 2GB slice0 and a ~14GB (to the last cyl) slice7.
2. zpool create striclek c1t1d0s0
3. zdb -l /dev/rdsk/c1t1d0s0 shows 4 labels, each with the same guid and only c1t1d0s0 as vdev. All is well.
4. format, increase slice0 from 2G to 16G. Remove slice7. Label.
5. zdb -l /dev/rdsk/c1t1d0s0 shows 2 labels with the correct guid & c1t1d0s0, but it also shows 2 labels with some old guid (from an rpool which was abandoned long ago) belonging to a mirror(c1t0d0s0,c1t1d0s0). c1t0d0s0 is the current boot disk, with a different rpool and a different guid.
6. zpool export striclek; zpool import shows the guid from the "working pool", but claims it's missing devices (although it only lists c1t1d0s0 - ONLINE).
7. zpool import striclek doesn't work. zpool import theworkingguid doesn't work.

If I resize the slice back to 2GB, all 4 labels show the workingguid and import works again.

Questions:
* Why does 'zpool import' show the guid from label 0/1, but want the vdev config as specified by label 2/3?
* Is there no timestamp or such, so that it would prefer label 0/1 (which are brand new) and ignore label 2/3 (which are way old)?

I can accept being forced to scribble zeroes/junk all over the "slice7 space" which we're expanding into in step 4, but stuff shouldn't fail this way IMO. Maybe compare timestamps, see that labels 2/3 aren't so hot anymore, and ignore them, or something.

zdb -l and zpool import dumps at:
http://www.acc.umu.se/~stric/tmp/zdb-dump/

/Tomas
-- 
Tomas Ögren, stric at acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
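A rough sketch of the zero-scribbling workaround mentioned in the last paragraph, assuming the space being absorbed is still addressable as c1t1d0s7 and contains nothing worth keeping (the dd simply runs to the end of the slice and wipes whatever stale labels live there; slice and pool names are taken from the steps above):

  zpool export striclek
  dd if=/dev/zero of=/dev/rdsk/c1t1d0s7 bs=1024k
  format          # grow slice0 to 16G, delete slice7, label
  zpool import striclek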
Cindy Swearingen
2009-Oct-19 19:20 UTC
[zfs-discuss] Interesting bug with picking labels when expanding a slice where a pool lives
Hi Tomas,

I think you are saying that you are testing what happens when you increase a slice under a live ZFS storage pool and then reviewing the zdb output of the disk labels.

Increasing a slice under a live ZFS storage pool isn't supported and might break your pool.

I think you are seeing some remnants of some old pools on your slices with zdb, since this is how zpool import is able to import pools that have been destroyed.

Maybe I missed the point. Let me know.

Cindy
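One quick way to spot such remnants is to dump the labels of every slice that ever held a pool and compare the name/txg/guid fields from each label; a small sketch along the lines of the zdb calls above (slice names are taken from Tomas' steps, and this assumes zdb -l prints its usual name/txg/guid lines):

  for s in 0 7; do
          echo "== c1t1d0s$s =="
          zdb -l /dev/rdsk/c1t1d0s$s | egrep 'name|txg|guid'
  done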
Tomas Ögren
2009-Oct-19 20:18 UTC
[zfs-discuss] Interesting bug with picking labels when expanding a slice where a pool lives
On 19 October, 2009 - Cindy Swearingen sent me these 2,4K bytes:

> Hi Tomas,
>
> I think you are saying that you are testing what happens when you
> increase a slice under a live ZFS storage pool and then reviewing
> the zdb output of the disk labels.
>
> Increasing a slice under a live ZFS storage pool isn't supported and
> might break your pool.

It also happens on a non-live pool, that is, if I export, increase the slice and then try to import:

root at ramses:~# zpool export striclek
root at ramses:~# format
Searching for disks...done
... increase c1t1d0s0 ...
root at ramses:~# zpool import striclek
cannot import 'striclek': one or more devices is currently unavailable

.. which is the way to increase a pool within a disk/device if I'm not mistaken, like when the storage comes off a SAN and you resize the LUN.

> I think you are seeing some remnants of some old pools on your slices
> with zdb since this is how zpool import is able to import pools that
> have been destroyed.

Yep, that's exactly what I see. The issue is that the new & good labels aren't trusted anymore (it also looks at the old ones), and also that "zpool import" picks information from different labels and presents it as one piece of info.

If I was using some SAN and my LUN got increased, and the new storage space had some old scrap data on it, I could get hit by the same issue.

> Maybe I missed the point. Let me know.

/Tomas
-- 
Tomas Ögren, stric at acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
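For context on why exactly two of the four labels go stale: ZFS keeps four 256 KB vdev labels per device, labels 0 and 1 in the first 512 KB and labels 2 and 3 in the last 512 KB. Growing slice0 moves "the last 512 KB" into space that used to belong to slice7 (and, presumably, to the abandoned rpool before that), so labels 2/3 are now read from wherever that old pool left its end-of-device labels. A small arithmetic sketch (ksh/bash; the sizes are illustrative round numbers, not the exact cylinder-aligned geometry of these disks):

  # each label is 256 KB; labels 2/3 start 512 KB and 256 KB before the end
  SIZE=$((2 * 1024 * 1024 * 1024))        # 2 GB slice0
  echo "labels 2/3 at $((SIZE - 524288)) and $((SIZE - 262144))"
  SIZE=$((16 * 1024 * 1024 * 1024))       # slice0 grown to 16 GB
  echo "labels 2/3 at $((SIZE - 524288)) and $((SIZE - 262144))"   # now inside the old slice7 area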
Cindy Swearingen
2009-Oct-19 21:36 UTC
[zfs-discuss] Interesting bug with picking labels when expanding a slice where a pool lives
Hi Tomas,

Increasing the slice size in a pool by using the format utility is not equivalent to increasing a LUN size. Increasing a LUN size triggers a sysevent from the underlying device that ZFS recognizes. The autoexpand feature takes advantage of this mechanism.

I don't know if a bug is here, but I will check. A workaround in the meantime might be to use whole disks instead of slices on non-root pools.

Take a look at the autoexpand feature in the SXCE build 117 release. You can read about it here:

http://docs.sun.com/app/docs/doc/817-2271/githb?a=view

Cindy
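For reference, on a release that has the autoexpand feature (build 117 or later, per the doc link above) the expansion path looks roughly like this; the property and the -e flag come from that feature, while the pool and device names are just Tomas' examples:

  zpool set autoexpand=on striclek
  # or, after the LUN has been grown underneath an imported pool:
  zpool online -e striclek c1t1d0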